Genetics in the Medical School Curriculum


C. NASH HERNDON 


Department of Medical Genetics, Bowman Gray School of Medicine of Wake Forest College, 
Winston-Salem, North Carolina 


MEDICINE IS both an art and a science. While accurate scientific knowledge is much
preferred, physicians have never scorned empiricism. Genetics is the youngest of the 
biological sciences, yet physicians have been observing and recording the familial 
aggregation of certain diseases for centuries. The family physician of the 19th century 
often made surprisingly accurate predictions and early diagnoses, based largely on 
knowledge of the family history. Observations of this sort soon found their way into 
medical school textbooks and lectures. As evidence of the accuracy of some of these 
empiric observations, it is of interest to note that the early editions of Sir William 
Osler’s textbook of internal medicine, published 10 years before the rediscovery of 
Mendel’s work, carried an accurate description of the transmission of a sex-linked 
recessive trait (Osler, 1894). In his discussion of hemophilia, Osler pointed out that 
all daughters of a hemophiliac male are asymptomatic carriers of the condition, and 
also remarked on the frequent appearance of new cases in previously healthy families, 
which we now refer to as mutations. It was more than two decades before the cyto- 
logic basis for these empiric clinical observations became firmly established. It is also 
of historic interest that the applicability of Mendel’s rediscovered work to man was 
promptly recognized by certain workers, and that Garrod (1902) was able to show 
that alcaptonuria is due to an autosomal recessive gene as early as 1902. 

In the early days several workers attempted to apply the lessons learned from 
Drosophila and other laboratory animals to human problems, but knowledge accumu- 
lated slowly during the first quarter of the current century. It is well known that 
some of the studies of this period were poorly done and incorrectly interpreted, and 
did the cause of human genetics far more harm than good. The history of the develop- 
ment of research and the applications of genetic knowledge to medical problems was 
brilliantly reviewed by Snyder (1951) in his Presidential address to this Society as 
part of the Golden Jubilee Celebration of Genetics in 1950. 

The medical curriculum changes only slowly, and there was little if any organized 
instruction in genetics in American schools of medicine during the first three decades 
of this century. In the early 1930’s, serious proposals that genetics should be regu- 
larly taught as a basic science and as a clinical subject were made by Madge Macklin 
(1932), Laurence Snyder (1933) and William Allan (1936). These three should be 
regarded as the pioneers and prime movers in introducing genetics as a science to the 
schools of medicine. By means of their many formal and informal addresses to medical 
groups, Macklin, Snyder and Allan continually emphasized the practical applications
of genetics to medical problems and urged the inclusion of genetics in the medical
curriculum. They were soon aided and abetted in their campaign by a number of
research workers and educators.

Macklin has remarked that her first teaching of genetics at the University of 
Western Ontario was as a “bootleg” addition to another course, which so proved its 
worth that a course in genetics was later established. So far as I am aware, the first 
regularly required course in medical genetics in the United States was organized by 
Snyder at Ohio State University in 1933. A few elective courses soon appeared in 
other medical schools, and several added occasional lectures on genetic topics as 
part of existing courses. In 1940 the Department of Medical Genetics at the Bowman 
Gray School of Medicine was organized by Allan, and Snyder gave the first group of 
lectures at this school, also giving the same lectures at Duke University and at the 
University of North Carolina. The publication of these lectures (Snyder, 1941) 
marks the appearance of the first book in this country suitable as a text for medical 
students. The establishment of the Heredity Clinic at the University of Michigan 
in 1941, the Dight Institute at the University of Minnesota in 1943 and the Labora- 
tory of Human Genetics at the University of Utah in 1945 greatly increased the 
awareness of developments in this field in the minds of medical educators. The Uni- 
versities of California, Oklahoma and Texas also became centers of influence, as well 
as Johns Hopkins and Tulane. The Department of Medical Genetics of the New 
York State Psychiatric Institute, under the direction of Dr. Franz Kallmann, has 
particularly influenced graduate medical training since 1935, and the cause of genet- 
ics in Canada has been advanced by the Departments of Medical Genetics in Toronto 
and Montreal. 

The first systematic attempt to obtain exact information on the extent of genetic 
instruction was made by Robertson and Haley in 1946. They sent questionnaires to 
84 schools of medicine in the United States and Canada, and received replies from 
60. A formal course in medical genetics was offered in 7 schools, with an average of 
15 class hours of instruction. Some lectures on genetics as part of other courses were 
given in 25 schools, with an average of 5 class hours of instruction. While the number 
of assigned class hours in most schools was certainly inadequate, it is encouraging 
that 38 per cent of medical schools assigned some time to genetics in 1946. 

A second questionnaire survey was done in 1953 with the approval of the Associa- 
tion of American Medical Colleges (Herndon, 1954). The findings of this survey were 
reported to the sixth annual meeting of this Society as a part of the symposium on 
human genetics and medical education organized by Dr. Macklin. By 1953, some 
instruction in genetics was being given in 55 per cent of the 87 medical colleges in the
United States and Canada. While the increase in the number of colleges offering 
genetic instruction was gratifying, it was also noted that the total amount of genetic 
instruction in more than half of these schools was still less than 5 lecture hours. The 
great majority of instruction was given as a part of other courses, such as anatomy 
or pediatrics, and the number of separate courses in medical genetics had not in- 
creased in the preceding seven years. 

The comments received with these questionnaires and information obtained from
a number of other sources indicate that there has been a considerable increase in
interest in genetics among medical educators within the past few years. Recent re- 
ports have indicated that the curriculum committees of at least a dozen schools of 
medicine are seriously considering the addition of courses in genetics. One may also
recall that Dean McEwen (1952) of the New York College of Medicine, who ad- 
dressed the fifth annual meeting of this Society, suggested that the teaching of 
genetics should be included in a proposed Department of Human Biology. This pro- 
posal was described in more detail by Sheehan and Harman (1953), and it may be 
pointed out that Cummins (1943) advocated a similar integrated approach to medi- 
cal education from the viewpoint of human biology ten years earlier. 

The most striking evidence of the interest of medical educators in genetics as a 
medical subject is provided by the 1954 Teaching Institute of the Association of 
American Medical Colleges. The Association is sponsoring a series of teaching insti- 
tutes devoted to discussion and exchange of information concerning curriculum con- 
tent, teaching methods and faculty and student problems, each institute covering a 
carefully chosen area of the general curriculum. Representatives from each medical 
school in the United States and Canada are present at these week-long meetings. The 
second Teaching Institute, held during October, 1954, was devoted to the subjects of 
pathology, microbiology, immunology and genetics. The full report of the 1954 
Teaching Institute will be published as a supplement to the September, 1955, issue 
of the Journal of Medical Education. As chairman of the subcommittee on genetics, 
Dr. James V. Neel was largely responsible for the planning of the genetic aspects of 
this Institute. As genetics was certainly the youngest and least established of the 
disciplines represented, the first evening of the Institute was devoted to a panel
discussion on “Genetics in Medical Education.” The geneticists invited as panelists
were James V. Neel, William C. Boyd, Bernard D. Davis, John B. Graham, Curt 
Stern and myself. I must admit that most of this group approached this meeting in 
a decidedly defensive frame of mind. We expected to be called upon to justify our 
very existence, and were prepared to attempt to “sell” genetics as a medical subject 
against heavy and determined opposition to any encroachments on the existing cur- 
riculum. As it turned out, we were tilting at windmills. Among the large group in 
attendance there was practically unanimous agreement that some knowledge of ge- 
netics would be quite useful to the practicing physician, and that the schools of 
medicine have an urgent responsibility to provide this training. In fact, the geneticists 
were gently chided for not having met this problem more vigorously in the past. The 
discussions during the next five days of the Teaching Institute did not question the 
advisability of teaching genetics in the medical school, but were concerned with 
what to teach, how to teach it, and who should do the teaching. 

The Teaching Institutes do not attempt to formulate policy or to recommend spe- 
cific curriculum changes or teaching methods to the constituent medical schools of 
the Association. Hence no ready made program or definitive plan of action is to be 
expected from the discussions of this group. It is recognized that each medical school 
has its own unique set of facilities, requirements and opportunities, and that each 
school must seek the solution that fits its own circumstances. 

Extended informal discussions tend to crystallize opinion, however. The majority 
opinion at the Teaching Institute seemed to favor the establishment of a course in
basic principles of genetics of 10 to 12 lecture hours at some time during the first two 
years of the medical course. Many felt that the genetics course should be correlated 
with instruction in pathology and microbiology. Additional instruction in the appli- 
cations of genetics to the clinical subjects is also necessary. While this could be ac- 
complished by a clinical course in hereditary diseases, the majority felt that this 
could be best accomplished by integrated teaching during the clinical years. Inte- 
grated teaching would visualize the geneticist as being present at certain lecture- 
clinics to discuss the genetic aspects of specific patients or groups of diseases being 
presented to the third or fourth year students by a clinician. Certain special lectures 
might also be given by the geneticist as part of existing courses, such as lectures on 
erythroblastosis or congenital malformations to the obstetrics class. A number of 
medical schools also feel the need for more instruction in biostatistics, and the possi- 
bility of utilizing a geneticist for instruction in this area was discussed. The group 
also felt that the geneticist should be available for clinical consultations with the 
staff and for genetic counseling service to patients and their families. The value of 
heredity clinics, which Professor Dice (1952) discussed in his presidential address to 
this Society in 1951, was generally recognized. The new book on genetic counseling 
by our president-elect, Dr. Reed (1955), should prove to be generally useful in con- 
nection with the needed clinical services. 

The opinions just expressed, which were summarized in a report of the highlights 
of the Teaching Institute given at the 1954 meeting of the Association of American 
Medical Colleges (Herndon, 1955), really sound most encouraging. They can be fairly 
presented as opinions held by a number of persons of responsibility in the administra- 
tion and curriculum planning of our schools of medicine. But there are also serious 
difficulties that arise. Two stumbling blocks in the path were particularly empha- 
sized in the discussions of the Teaching Institute. If a school that is now teaching no 
genetics should add to its staff a full-time medical geneticist to establish a full pro- 
gram, it would be necessary to provide a suitable salary and working facilities. These 
cost money, and all medical schools have their financial problems. The development 
of adequate programs of medical genetics in additional schools cannot proceed until 
the requisite funds become available from some source. While financing is a per- 
petual headache to deans and other administrative officers, this mere professor has 
faith that administrations will somehow find funds to support any teaching plan 
that demonstrates its value, usefulness and practicality. 

The second difficulty is in some respects even more serious, and is shared to some 
degree by all preclinical departments in the medical school. This is the problem of 
teacher recruitment. There are simply not enough well trained and available medical 
geneticists to provide staff members for all of the medical schools. Most of the people 
possessing adequate training in formal genetics and experience in teaching and ge- 
netic counseling are now productively employed. Their removal to medical schools 
would only create vacancies elsewhere. A new supply of well trained professional 
workers in human genetics is certainly needed. It seems likely that new positions 
will become available, and that the demand for teachers will exceed the supply. 

The participants in the Teaching Institute also discussed the desired qualifications 
of instructors in genetics. Some felt that possession of an M.D. degree plus special
training in genetics would be most desirable. This group urged that the teacher re- 
cruitment program should concentrate on urging interns and assistant residents to 
obtain special training in genetics. It is difficult at best for purely academic positions 
to compete with the glamour and financial promise of the private practice of medicine. 
The supply of teachers from this source is likely to be small indeed in the immediate 
future, but the widespread establishment of faculty positions in medical genetics 
may soon encourage young physicians to seek professional careers in this field. Others 
suggested that the best supply of teachers could be obtained from among those earn- 
ing a Ph.D. degree in human genetics. It seems certain that there is a place on the 
medical faculty for the geneticist without medical training. While the geneticist would 
necessarily be dependent on the clinician in the handling of clinical problems, this 
is not a serious handicap. While this source of teachers seems more promising, the 
visible supply still seems to be short of the probable demand of the next several years. 

It is thus apparent that the major obstacles to the establishment of additional 
genetics courses in schools of medicine are two in number: lack of funds and lack of 
teachers. While both problems would probably wear away gradually with time, the 
need is immediate and a prompt solution is much to be desired. An ingenious pro- 
posal to meet the requirements of the immediate future was made by James V. Neel 
during the discussions of the Teaching Institute. Neel pointed out that almost every 
medical faculty contains at least one man whose research or clinical interest has led 
him into some phase of the field of genetics, and who thus has an interest in the sub- 
ject. This man may be an ophthalmologist, an orthopedist or a pediatrician in one 
school, or an anatomist, embryologist or hematologist in another. These people al- 
ready hold established positions and are trained teachers. With a moderate amount 
of special training in basic genetic principles, these instructors could initiate new 
courses in elementary genetics with particular reference to clinical applications, and 
could hold the fort until an adequate supply of medical geneticists becomes available. 
Neel has suggested the possibility of organizing a summer work-shop designed to pro- 
vide these people with the tools necessary for teaching elementary genetics. While a 
work-shop could not be expected to turn out polished geneticists in a few weeks, a 
highly concentrated graduate course should be able to cover enough ground to meet 
the immediate needs of the schools of medicine. 

Neel’s proposal would require the use of the facilities of one of the larger universi- 
ties that already has an active program in human genetics. It would require a certain 
amount of financial support. As the medical schools would profit directly, some in- 
direct help in financing might possibly be expected from them. It is also possible 
that one or more of the major foundations might become interested in this project. 
Certainly the idea of training established teachers for an additional and needed spe- 
cific job should be an attractive one to foundations interested in the progress and im- 
provement of medical instruction. It should also be apparent that no one university 
now has a staff of geneticists that could furnish the manpower to carry out this pro- 
posal. Cooperation of several departments in several universities would be necessary 
to furnish the required staff of experienced and competent leaders for this work-shop. 

Neel’s proposal seems to be a well conceived and practical method for training the 
instructors in genetics now needed by the schools of medicine. Its greatest weakness
is that it would require a phenomenal amount of cooperation between various indi- 
viduals, universities, and other organizations. It would seem that a plan of this sort 
could be expected to receive widespread support only if it is sponsored and actively 
supported by a nation-wide organization with unquestioned altruistic objectives. The 
only organization with the required prestige and with the wide membership base to 
speak for human genetics as a scientific discipline is the American Society of Human 
Genetics. I therefore wish to propose that this Society adopt the policy of active par- 
ticipation in solving the problem of making genetic instruction available to all schools 
of medicine. I would propose the establishment of a Commission on Education in 
Medical Genetics, authorized to collect and coordinate information concerning the 
needs and requirements of the medical schools with regard to curriculum planning 
and the training and placement of staff members, and to take such action as may be 
necessary to meet existing requirements. Neel’s work-shop proposal described above 
should be referred to this Commission and exhaustively studied. It is entirely possible 
that further study might suggest an improved modification of this proposal that 
would better fit the existing circumstances. It might be possible to work out a long 
range plan of teacher recruitment and training, with a system of fellowships in medi- 
cal genetics. This organization could certainly serve as a useful clearing house for 
information concerning available positions and personnel. A Commission with these 
objectives would require the active support of all members of the Society, and should 
be free to call on any member for technical advice or special services. It would be able 
to cooperate with the Association of American Medical Colleges and with other or- 
ganizations concerned with medical education. This Commission should be able to 
make an outstanding contribution to the advancement of medical education and to 
the continuation of research in human genetics. 

We have seen that progress has been slow, but steady, in the introduction of ge- 
netics to the medical curriculum during the first half of this century. The progress 
made has been in large part due to the persistent urging of people of vision, such as 
Snyder, Macklin, Allan and others, who have repeatedly called attention to the value 
of genetics in understanding the pathogenesis of disease and its applications in early 
diagnosis and in prevention of disease. In my opinion a critical period has now been 
reached. Events of the past few years have aroused great interest in medical genetics 
among medical faculties and the administrators of medical schools. They apparently
see the need for genetics in the medical curriculum, and are seeking methods to satisfy 
these needs. If we allow this interest to die of frustration, the opportunity may not 
soon be repeated. I feel that it is the responsibility of the American Society of Human 
Genetics to seize this opportunity and to provide the technical assistance that the 
medical schools now desire. I have faith that this Society will regard the present 
situation as both a challenge and a trust, and that it will not fail to take wise and 
carefully considered action. 
Maternal Age and Birth Order as Indices of
Environmental Influence* 


THOMAS McKEOWN AND R. G. RECORD


Department of Social Medicine, University of Birmingham, Birmingham, England 


AS A MEANS OF EXPLORING the nature-nurture complex presented by most common
diseases, examination of association between incidence and maternal age and birth 
order is likely to prove of great value. This is far from being a new idea, but there are 
two reasons why the examination has hitherto been less fruitful than might have been 
expected. The first is that the appropriate methods have had less careful scrutiny 
than their difficulties require, with the result that there are few conditions whose re- 
lationship to age or birth order is firmly established. The second is that an observed re- 
lationship has as a rule been regarded as the end rather than the beginning of enquiry. 
It is proposed to examine the methods used to establish association between incidence 
and age and birth order, and to illustrate the suggestive nature of the association by 
examples. 


METHODS 


If association between incidence of an abnormality¹ and maternal age and birth
order is marked, it may be demonstrated by comparatively simple methods, or may 
indeed be strongly suggested by clinical impressions. The observations that infantile 
pyloric stenosis is more common in first than in later born, and that incidence of 
mongolism is raised in children of older mothers, were reported in the literature many 
years before they were confirmed by more elaborate methods. But in most abnormali- 
ties the effect of age and birth order is not so marked, and more thorough investiga- 
tion is required to establish association with the two variables, as well as to assess their 
relative importance. Two methods have been used. 


Comparison of Affected with a Related Population 


Distributions of affected by maternal age and birth rank are compared with
those of the population of births from which they are drawn. This method is more
useful in the case of conditions manifested at or shortly after birth, since at any 
considerable period after birth it is almost impossible to identify propositi with a 
related population. Unfortunately age and birth order are readily available only for 
very large populations (in England and Wales the Registrar General gives them for 
the country as a whole and for large regions) in respect of which it is usually out of 
the question to trace all or even a substantial proportion of affected. For the smaller 
areas in which affected can be traced, age and birth order of the related population 
are usually not given in local or national statistics, and must be established laboriously
by examination of a representative sample. Subject to reservations referred to below,
this method gives a reliable estimate of association between incidence of an abnormality
and maternal age and birth order, and has the advantage that it is possible by a
simple process of standardization to separate the effects of the two variables.

Received August 29, 1955.
* Prepared from a paper contributed to the World Population Conference, Rome, 1954.
¹ The term “abnormality” is used in reference to any physical or mental defect.
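
One common form of such a standardization may be sketched as follows. The text does not give the arithmetic, so it is assumed here that the expected number of affected at each birth rank is obtained by distributing the affected in each maternal age group over birth ranks in the proportions shown by related births of the same age group. The function name and all figures are hypothetical.

```python
def expected_by_rank(affected_by_age, births_by_age_and_rank):
    """Expected affected at each birth rank, standardized for maternal age.

    affected_by_age: number of affected in each maternal age group.
    births_by_age_and_rank: related births, by age group and then birth rank.
    Assumption: within an age group, affected would be spread over birth
    ranks in the same proportions as all related births of that age group.
    """
    ranks = sorted({r for row in births_by_age_and_rank.values() for r in row})
    expected = {r: 0.0 for r in ranks}
    for age, n_affected in affected_by_age.items():
        row = births_by_age_and_rank[age]
        total_births = sum(row.values())
        for rank, births in row.items():
            expected[rank] += n_affected * births / total_births
    return expected


if __name__ == "__main__":
    # Hypothetical figures, for illustration only.
    affected = {"under 30": 10, "30 and over": 20}
    births = {
        "under 30": {1: 60, 2: 40},
        "30 and over": {1: 20, 2: 80},
    }
    for rank, value in expected_by_rank(affected, births).items():
        print(rank, value)
```

Observed numbers at each birth rank that depart from these expectations would then point to an effect of birth order not accounted for by maternal age; exchanging the roles of the two variables gives the corresponding comparison for age.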


The Greenwood-Yule Method


When information about a related population of births is not available, and it 
usually is not in the case of conditions not manifested until some time after birth, it is 
possible to enquire whether affected are randomly distributed by maternal age or 
birth order within their own fraternities. The method, developed by Greenwood and 
Yule (1914), examines the hypothesis that within a fraternity containing one or more 
affected individuals all members are exposed to the same risk (equal to the number of 
affected members divided by the total number of individuals in the sibship). Ex- 
pected numbers of affected in each birth rank (or each maternal age group) are de- 
rived by summation and are compared with observed values. Its correct use may be 
illustrated by three examples. Fogh-Andersen (1942) investigated harelip and cleft 
palate in patients over 20; his conclusion that incidence is unrelated to birth rank was 
later confirmed by comparison of cases identified in infancy with a related population 
(MacMahon & McKeown, 1953). In an investigation of neurosis in persons aged more 


TABLE 1. GREENWOOD-YULE METHOD APPLIED TO A RANDOM SAMPLE OF BIRTHS IN BIRMINGHAM,
ENGLAND, 1942-46

Family size                 Birth rank of propositus
in 1948                 1       2       3       4       5     Total
    2                  49     107                              156
    3                  13      36      42                       91
    4                   1       7      20      33               61
    5                   1       3       5      12      18       39
    6                  ..                                       16
    7                  ..                                        6
    8                  ..                                        5
    9                  ..                                        3
   10                                                            -
   11                  ..                                        2
   12                                                            -
   13                                                            -
   14                  ..                                        1

                                    Birth rank of propositus
                                1         2       3 & 4    5 to 14    Total
Total: Observed (a)            64       153        113        50       380
       Expected (b)        136.119   136.119    85.904    21.858       380
Difference (a - b)         -72.119   +16.881   +27.096   +28.142         -
Variance of expected
  number (V)                80.929    80.929    51.486    13.922
(a - b)/√V                   -8.02     +1.88     +3.78     +7.54

.. Entries by birth rank not legible.
Since the Greenwood-Yule method makes no use of them, data relating to one-child families
have been excluded.
than 14 years, Norton (1952) found an increased incidence at birth ranks after the
third, and was in the unusual position of being able to confirm this result by a direct
comparison of affected with a control group. Acceptable results were also obtained
by Böök (1953) in a study of schizophrenia.
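
The arithmetic of the Greenwood-Yule method can be sketched for the simple case in which each fraternity contains one affected member, so that every member of a fraternity of size n carries a risk of 1/n. The family-size totals used below are those of Table 1. The binomial variance p(1 - p) for each fraternity is an assumption, not stated in the text, and the function names are illustrative.

```python
from math import sqrt

# Number of fraternities of each size (Total column of Table 1).
FAMILIES_BY_SIZE = {2: 156, 3: 91, 4: 61, 5: 39, 6: 16,
                    7: 6, 8: 5, 9: 3, 11: 2, 14: 1}


def expected_and_variance(families_by_size, first_rank, last_rank=None):
    """Expected number of propositi, and its variance, for a group of ranks.

    The group runs from first_rank to last_rank inclusive; last_rank=None
    means "first_rank and over". Each fraternity is taken to contain exactly
    one affected member, equally likely to occupy any birth rank.
    """
    expected = 0.0
    variance = 0.0
    for size, count in families_by_size.items():
        top = size if last_rank is None else min(last_rank, size)
        ranks_in_group = max(0, top - first_rank + 1)
        p = ranks_in_group / size          # risk that the propositus falls in the group
        expected += count * p
        variance += count * p * (1.0 - p)  # assumed binomial variance
    return expected, variance


def standardized_difference(observed, expected, variance):
    """(a - b) / sqrt(V), as in the last row of Table 1."""
    return (observed - expected) / sqrt(variance)


if __name__ == "__main__":
    observed = {"1": 64, "2": 153, "3 & 4": 113, "5 and over": 50}
    groups = {"1": (1, 1), "2": (2, 2), "3 & 4": (3, 4), "5 and over": (5, None)}
    for label, (lo, hi) in groups.items():
        e, v = expected_and_variance(FAMILIES_BY_SIZE, lo, hi)
        z = standardized_difference(observed[label], e, v)
        print(label, round(e, 3), round(v, 3), round(z, 2))
```

With these totals the sketch gives expected numbers of 136.119, 136.119, 85.904 and 21.858 for the four groups of birth ranks, and variances that agree with those printed in Table 1 to about two decimal places.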

These three reports were restricted to patients whose fraternities were complete. 
If fraternities are not complete, errors are introduced because risks are assigned to 
sibs born before the period of enquiry but not to sibs born after the period. The 
error becomes more serious as the ratio of the number of sibships begun before the 
period of enquiry to the number completed within the period increases, that is 
as the time interval from which propositi are selected becomes shorter. This limitation 
of the Greenwood-Yule method is illustrated in Table 1. A random sample was made
from children born in Birmingham in the years 1942-46. Information about family 
size was obtained from mothers in 1948 when few fraternities were complete. The 
method does not give even distribution within the birth ranks, but shows a significant 
excess at the higher orders. 
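
The source of this error can be shown with a small artificial population in which birth rank has no effect on selection. In the sketch below every family eventually has three children born two years apart, one family begins in each calendar year, every birth in a five-year period is treated as a propositus, and family size is recorded shortly after the period ends. All of these figures, and the names used, are hypothetical.

```python
from fractions import Fraction

SPACING = 2           # years between successive births (hypothetical)
FINAL_SIZE = 3        # eventual size of every family (hypothetical)
PERIOD = range(0, 5)  # propositi are all births in years 0 to 4
SURVEY_YEAR = 6       # date at which family size is recorded


def simulate():
    """Return observed and Greenwood-Yule expected numbers by birth rank."""
    observed = {r: 0 for r in range(1, FINAL_SIZE + 1)}
    expected = {r: Fraction(0) for r in range(1, FINAL_SIZE + 1)}
    # One family begins in each year; early starts supply the later ranks.
    for first_birth in range(-SPACING * (FINAL_SIZE - 1), max(PERIOD) + 1):
        birth_years = [first_birth + SPACING * k for k in range(FINAL_SIZE)]
        # Sibs born after the survey are unknown, so the fraternity looks smaller.
        size_at_survey = sum(1 for year in birth_years if year <= SURVEY_YEAR)
        for rank, year in enumerate(birth_years, start=1):
            if year in PERIOD:
                observed[rank] += 1
                # Equal risk is assigned to every sib known at the survey.
                for r in range(1, size_at_survey + 1):
                    expected[r] += Fraction(1, size_at_survey)
    return observed, expected


if __name__ == "__main__":
    observed, expected = simulate()
    for rank in observed:
        print(rank, observed[rank], float(expected[rank]))
```

Although the propositi are spread evenly over the three birth ranks, the expected numbers are too high for the first two ranks and too low for the third, because first-born propositi near the end of the period have sibs not yet born at the survey. The direction of the discrepancy is the same as that in Table 1.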

This objection to the use of the Greenwood-Yule method when fraternities are not
all complete, and when propositi are selected from a limited period of time (as is 
usual when affected are obtained from institutional records, or from births in con- 
secutive years), does not arise if attention is restricted to sibships begun and com- 
pleted within the period in which propositi are identified. This procedure makes very 
limited use of the available data, however, and is open to the objection that fraterni- 
ties selected in this way are unrepresentative (because in practice fraternities can be 


TABLE 2. MODIFIED GREENWOOD-YULE METHOD APPLIED TO A RANDOM SAMPLE OF BIRTHS IN
BIRMINGHAM, 1942-46 (RESTRICTED TO PROPOSITI AND SIBS BORN WITHIN THE PERIOD 1942-46)

No. of births in each family                Birth rank of propositus or sib
during period 1942-46             1   2   3   4   5   6   7   8   9  10  11  12  13  14   Total
2   No. of propositi             35  59  20  19  10   5   4   -   3   -   1   -   -   1     157
    No. of sibs                  44  48  30  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..     157
3   No. of propositi              4   9   9   9   3   3   -   -   -   -   -   -   -   -      37
    No. of sibs                  10  18  24  12   5   1   2   2   -   -   -   -   -   -      74
4   No. of propositi              1   1   1   -  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..       4
    No. of sibs                   -   2   2   3  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..      ..

                                       Birth rank
                                  1        2      3 & 4   5 to 14   Total
Propositi: Observed (a)          40       69       58       31       198
           Expected (b)        44.417   63.250   60.500   29.833     198
Difference (a) - (b)           -4.417   +5.750   -2.500   +1.167
Variance of expected
  number (V)                   23.048   33.312   18.027    4.653
(a - b)/√V                      -0.92    +1.00    -0.59    +0.54

.. Entry not legible.
The method makes no use of data relating to propositi with no sibs born during the period;
these fraternities have been excluded.
TABLE 3. DISTRIBUTION OF MALFORMATIONS OF THE CENTRAL NERVOUS SYSTEM BORN IN BIRMINGHAM,
1940-47

                                                   Birth rank
                                   1        2        3        4      5 & 6   7 & over   Total
Control group method
  No. affected
    Observed (a)                  334      173       91       53       56       48       755
    Expected (b)                256.42   224.87   112.94    75.30    62.07    23.40      755
    (a - b)/S.E.                 +4.10    -3.03    -1.64    -2.05    -0.58    +2.99
Greenwood-Yule method (sibships incomplete)
  No. affected
    Observed* (a)                 207      165       90       53       57       48       620
    Expected (b)                202.32   201.52   108.92    48.69    37.64    20.91      620
    (a - b)/S.E.                 +0.42    -3.32    -2.18    +0.71    +3.70    +7.54
Modified method
  No. affected
    Observed (a)                  181      130       77       33       45       32       498
    Expected (b)                125.26   157.91    95.00    46.83    40.83    32.15      498
    (a - b)/S.E.                 +6.66    -2.97    -2.37    -2.57    +1.00    -0.07

Birth rank is based on number of previous pregnancies, including abortions.
* Includes 16 affected sibs of propositi.

judged to be complete only if the mother has reached the end of the reproductive 
period, which means that small families of older mothers will be accepted, whereas 
small families of younger mothers will not). An alternative possibility is to consider 
only propositi and sibs born within the period from which propositi are selected. In 
Table 2 attention has been restricted to propositi and sibs born in the years 1942-46, 
which are distributed according to birth rank and number of pregnancies. For example 
a seventh born propositus who within the period had one sib before and none after, 
would be shown in the first row under birth rank 7, and the sib in the second row 
under birth rank 6, expectations of 0.5 (variance 0.25) being allotted in each case.
This modified method gives reasonably good agreement between numbers observed 
and expected. Unfortunately when it is applied to a number of abnormalities whose 
distribution by age or birth order is known from comparison with related births, 
results are by no means equally good. In Tables 3 (malformations of the central 
nervous system), 4 (pyloric stenosis), 5 (patent ductus arteriosus) and 6 (mongolism), 
numbers observed and expected at each birth rank or (in the case of Table 6) maternal 
age group are given, using the Greenwood-Yule (incomplete sibships) and modified 
methods. When sibships are incomplete the Greenwood-Yule method fails to show 
the raised incidence of central nervous malformations, pyloric stenosis and patent 
ductus arteriosus in the first birth rank; it suggests that the incidence of pyloric 
stenosis may be raised in late birth ranks; and it gives a satisfactory result only in the 


TABLE 4. DISTRIBUTION OF PATIENTS WITH PYLORIC STENOSIS, BORN IN BIRMINGHAM, 1940-47

                                                      Birth rank
                                            1        2      3 & 4   5 & over    Total
Control group method
  No. affected
    Observed (a)                          155       90       62       20         327
    Expected (b)                       111.06    97.39    81.53    37.02         327
    (a - b)/S.E.                        +4.18    -0.75    -2.13    -2.64
Greenwood-Yule method (sibships incomplete)
  No. affected
    Observed (a)                           79       80       62       19         240
    Expected (b)                        83.95    85.71    56.42    13.92         240
    (a - b)/S.E.                        [row partly illegible; surviving values 0.67, 1.01, 1.79]
Modified method
  No. affected
    Observed (a)                           56       48   [illegible]  [illegible] 173
    Expected (b)                        51.58    60.58    46.26    14.58         173
    (a - b)/S.E.                        +0.83    -1.14   +40.41 [sic] +0.22

Birth rank is based on number of previous pregnancies, including abortions.
Fraternities containing more than one affected have been excluded.


TABLE 5. DISTRIBUTION OF PATIENTS WITH PATENT DUCTUS ARTERIOSUS, BORN IN THE BIRMINGHAM REGION, 1936-52

                                                      Birth rank
                                            1        2     3 & over
Control group method
  No. affected
    Observed (a)                           82       35       33
    Expected (b)                        59.77    43.47    46.76
    (a - b)/S.E.                        +3.71    -1.52    -2.43
Greenwood-Yule method (sibships incomplete)
  No. affected
    Observed (a)                           48       35       33
    Expected (b)                        41.74    42.69    31.57
    (a - b)/S.E.                        +1.27    -1.54    +0.37
Modified method
  No. affected
    Observed (a)                           47       34       24
    Expected (b)                        36.74    38.23    30.03
    (a - b)/S.E.                        +2.25    -0.91    -1.80

Birth rank is based on number of previous pregnancies, excluding abortions.
TABLE 6. DISTRIBUTION OF MONGOLS BORN IN BIRMINGHAM, 1942-52

                                                           Maternal age
                                     Under 25    25-      30-      35-      40-   45 & over   Total
Control group method
  No. affected
    Observed (a)                         14       36       40       59       60        8       217
    Expected (b)                      61.38    66.64    51.25    28.34     8.64     0.75       217
    (a - b)/S.E.                      -6.84    -4.23    -1.67    +5.31   +12.00    +4.87
Greenwood-Yule method (sibships incomplete)
  No. affected
    Observed (a)                      [entries truncated in source: 9, 2, 3, 4, 56, 7]         180
    Expected (b)                      37.98    47.59    40.35    30.92    21.37     1.59       180
    (a - b)/S.E.                      -6.88    -3.97    -2.10    +4.94    +9.80    +5.21
Modified method
  No. affected
    Observed (a)                      [row illegible]
    Expected (b)                      11.37    21.20    28.65    25.78    16.50     1.50       105
    (a - b)/S.E.                      -1.83    +1.02    -1.12    +0.06    +1.10    +1.73


case of mongolism. The modified method demonstrates the primogeniture effect in 
two of the three examples (pyloric stenosis is the exception); does not exhibit the 
increased incidence of malformations of the central nervous system in late birth 
ranks (7 and over); and fails to show a significant age trend (chiefly because of small 
numbers) in mongolism. 

On this evidence the suggested modification appears to be almost as unsatisfactory 
as the Greenwood-Yule method when the latter is used without information about 
complete sibships. The limitations of the methods have been explored further on 
models, constructed on the assumption that women reproduce at a uniform rate and 
that completed fraternities contain 2, 3, 4 or 5 children, numbers of fraternities of 
each size being equal. It is further assumed that some children exhibit a condition 
which is rare and not familial, so that there is little chance of recurrence in a sibship. 
When equilibrium has been attained with respect to distribution by birth rank, 
affected individuals are marked on a chart of fraternities in accordance with pre- 
assigned risks (relative incidence) which vary with birth rank. Fraternities identified 
by the birth of an affected child within a given period (the time taken for a woman 
to have three children has been used) are assembled. Observed numbers of affected 
in each birth rank are compared with numbers expected if there were no association 
with birth rank, the difference being related to the standard error of the expected 
values. Expected numbers were derived in four ways: (1) by using the birth rank 
distribution of the related population, (2) by the Greenwood-Yule method applied 
to complete fraternities, (3) by the Greenwood-Yule method applied to incomplete 
fraternities, and (4) by the modified method in which sibs born before or after the 
period are ignored. The construction of a model is shown in Appendix I. 
TABLE 7. RATIOS* DERIVED FROM POPULATION MODELS

                                                               Birth rank
                                                    1        2        3        4        5
Model A (780 fraternities)
  Relative incidence                                3        2        1        1        1
  Control group method                          +10.87    +1.36    -6.73    -5.26    -3.57
  Greenwood-Yule method (complete sibships)      +9.99    +0.36    -6.50    -4.59    -2.90
  Greenwood-Yule method (incomplete sibships)    -0.65    -0.65    -2.57    +3.73    +5.81
  Modified method                                [illegible apart from one entry, 0.00]

Model B (540 fraternities)
  Relative incidence                                1        1        1        2        3
  Control group method                           [illegible]
  Greenwood-Yule method (complete sibships)      -2.53    -2.53    -2.74    +4.23    +6.78
  Greenwood-Yule method (incomplete sibships)    [illegible]
  Modified method                                [illegible]

Model C (570 fraternities)
  Relative incidence                                2        1        1        1        2
  Control group method                           [first three entries illegible]  -2.56    +3.14
  Greenwood-Yule method (complete sibships)      +7.17    -4.23    -3.14    -2.48    +3.11
  Greenwood-Yule method (incomplete sibships)    -1.03    -4.94    -0.06    +3.49   +11.07
  Modified method                                [illegible]

Model D (780 fraternities)
  Relative incidence                                1        2        3        2        1
  Control group method                           [first four entries illegible]            -3.57
  Greenwood-Yule method (complete sibships)      -7.90    +1.68    +8.48    +0.64    -3.65
  Greenwood-Yule method (incomplete sibships)    [illegible]
  Modified method                                -6.25    +1.21    +6.05    -0.83    -3.29

* (No. observed - no. expected if randomly distributed by birth rank) / (standard error of expected values)
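
The control group row for Model A can be reproduced from the description of the model. The sketch below is a reconstruction under stated assumptions, not the authors' own working: equal numbers of completed fraternities of 2, 3, 4 and 5 children give 4, 4, 3, 2 and 1 births in ranks 1 to 5 out of every 14 births, and the standard error of an expected number is taken to be the binomial value. The function name `control_group_ratios` is hypothetical.

```python
from math import sqrt

# Births in ranks 1..5 per set of four fraternities (sizes 2, 3, 4, 5).
BIRTHS_PER_RANK = [4, 4, 3, 2, 1]
RELATIVE_INCIDENCE = [3, 2, 1, 1, 1]   # Model A, from Table 7
N_PROPOSITI = 780                      # Model A, from Table 7


def control_group_ratios(births, incidence, n):
    """Return (observed - expected) / S.E. for each birth rank.

    Observed numbers follow the preassigned relative incidence; expected
    numbers follow the birth rank distribution of the whole population.
    Assumption: S.E. = sqrt(n * p * (1 - p)), the binomial standard error.
    """
    total_births = sum(births)
    weights = [b * r for b, r in zip(births, incidence)]
    ratios = []
    for b, w in zip(births, weights):
        observed = n * w / sum(weights)
        p = b / total_births
        expected = n * p
        ratios.append((observed - expected) / sqrt(n * p * (1 - p)))
    return ratios


ratios = control_group_ratios(BIRTHS_PER_RANK, RELATIVE_INCIDENCE, N_PROPOSITI)
print([round(r, 2) for r in ratios])   # agrees with the printed row to within 0.01
```

With the relative incidences of Models C and D the same function gives values that match the surviving control group entries for those models (birth ranks 4 and 5 in Model C, birth rank 5 in Model D).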


Results (Table 7) are uniformly good by the Greenwood-Yule method if sibships 
are complete, and are quite satisfactory by the modified method. When sibships are 
incomplete, the Greenwood-Yule method may fail to reveal a primogeniture effect 
(models A and C) and may falsely suggest a rise in late birth ranks (models A and 
D). The method appears to give acceptable results in conditions whose incidence 
rises at late birth ranks (model B). But the reason for this is not that the method is 
accurate in these circumstances, but that its error is in the right direction. The chief 
disadvantage of the use of the Greenwood-Yule method when sibships are incomplete 
is, however, that it may suggest an association when none exists (models A and D). 
The modified method, on the other hand, appears to be free from this risk, although 
it may fail to identify an association when it is present, as shown in the actual examples
of pyloric stenosis and mongolism (Tables 4 and 6). One reason why the
method is unsatisfactory in these cases, although it is satisfactory on models, is that
it is very sensitive to changes in incidence arising from either a real secular variation, 
or from a change in the proportion of affected individuals identified in the different 
years covered by an enquiry. 

Since it is well recognized that the Greenwood-Yule method should be applied
only to completed fraternities, it may be asked whether it is necessary to stress
objections to its incorrect use. In practice, however, the method is frequently used 
when knowledge of sibships is incomplete. The following are examples of this. 

(i) Murphy and Mazer (1935) reported that the incidence of congenital defects 
increased in birth ranks after the fourth. The conclusion is unacceptable, since 
propositi were born in the years 1929-33, and data were assembled in 1934 when 
families were far from complete. The observation (Murphy, 1936) that mothers were 
older at the birth of a first affected than at the birth of a first normal child is of 
course quite misleading. 

(ii) Ingalls and Prindle (1949) reported that risk of oesophageal atresia increased 
with successive pregnancies. In every case the affected individuals were last-born, 
and families were evidently incomplete. Mean age of mothers at birth of propositi 
was said to be higher than in the related population of mothers, but recalculation of 
the standard error suggests that the difference is not significant. 

(iii) Polani and Campbell (1955) found the number of patients with congenital 
heart disease to be deficient if mothers were under 30, and excessive if mothers were 
aged 40-44. Sibships were considered to be complete, because mean family size 
(2.7) was approximately the same as that given by the Royal Commission on Popu- 
lation for couples married in 1925-9. This conclusion overlooks the fact that families 
selected by the presence of an affected member tend to be of greater than average 
size. If the incidence of congenital heart disease is 3 per 1000 (MacMahon, McKeown 
& Record, 1953) it can be shown (from data contained in Table C50 of the Papers of 
the Royal Commission on Population, 1954, Vol. VI, part 2) that families with one 
or more members affected resulting from fertile marriages in 1925-9 would have an 
expected mean size of 3.41 (see Appendix II). This suggests that the families exam- 
ined by Polani and Campbell were incomplete, and that the conclusion that the in- 
cidence of congenital heart disease is high at high maternal ages may be incorrect. 
MacMahon (1952), by a comparison between affected (mongols excluded) and related 
births, found no association with maternal age. 

(iv) Thompson (1951) reported that coeliac disease is more common in the higher 
birth ranks. The fact that mean family size was 2.1 suggests that many sibships were 
incomplete. 


Advantages and Disadvantages of the Two Methods 


Before commenting on the two methods we must try to clarify the purpose for 
which they are used. It would probably be agreed that however interesting investiga- 
tion of age and parity may be as an exercise in method, it cannot be regarded as an 
end in itself. We enquire whether a condition is more common at some ages or birth 
ranks than at others, for two reasons. In the first place, association between incidence 
and age or parity provides strong presumptive evidence of environmental influence; 
this is true even if the action of the influence is confined to the germ cells, leading to 
mutations in the ways referred to by Penrose (1955). Secondly, when considered in 
relation to clinical and pathological evidence the observations often suggest the 
direction of further enquiry. 

These considerations have a special relevance to the related subject of fertility.
With the Greenwood-Yule method, abnormal fertility of mothers of affected will
disturb the association of incidence with age or parity only if fertility is altered after 
birth of a propositus. For example, if fewer children are born after a fourth born 
affected than after a fourth born unaffected (either because of a decision to restrict 
families, or because of a change in fertility after birth of an affected child) there will 
be an apparent increase in incidence of affected in late ages or birth ranks. (The 
reduction in fertility has the same effect as examination of incomplete sibships, 
referred to above.) If the disturbance of fertility is of any other kind (for example if 
mothers of affected are relatively infertile throughout the period of reproduction) it 
will have no influence on the association between incidence and age or parity. Without 
information about a related population we are unable to examine fertility. Internal 
comparisons, of the kind used by Murphy (1940) who compared intervals preceding 
births of propositi and their sibs, are of course quite misleading. This matter is 
discussed more fully elsewhere (Record and McKeown, 1950). 

With the other method, based on an external comparison, abnormal fertility 
before the birth of the propositus will affect the association of incidence with age or 
parity. So important is this consideration that examination of fertility is usually 
essential before firm conclusions are drawn. Fortunately the method by which the 
information about age and parity is obtained—interview with mothers of affected 
and of related births—makes it possible to obtain the data needed for investigation 
of fertility. 

In short, with the Greenwood-Yule method association of incidence with age and 
parity is usually not disturbed by fertility, but when it is we do not know of it; with 
the alternative method abnormal fertility may influence the association, but we are 
usually in a position to investigate fertility. When it is remembered that it is the 
possibility of uncovering the significance of such matters as fertility which makes 
investigation of age and birth rank worth pursuing, the fact that fertility must be 
examined can hardly be regarded as a disadvantage. 

A difficulty common to both methods is that material available to investigation 
may be neither complete nor representative. In the case of conditions manifested at
birth (for example anencephalus) some affected are undoubtedly lost as abortions, 
and it is possible that the proportion lost is substantial. [Certainly abortion is 
common, and the proportion of aborted foetuses which are abnormal is probably 
high (Hertig, 1943)]. Abnormalities identified at a considerable period after birth 
may be even less complete if early mortality is high, as in mongolism. 

If the material, though incomplete, were representative, it would give a false 
estimate of incidence in each birth rank and maternal age group, but not of the 
relative significance of the two variables. It is quite likely, however, that individuals 
lost before or after birth are selected in respect of their distribution by maternal age 
and birth order. Certainly the stillbirth rate is sharply related to age and birth order, 
and so too, though for different reasons, is infant mortality. 

We may now summarize our conclusions about the two methods, which are complementary.
In the case of conditions manifested at or shortly after birth, a comparison
between affected and the total population of births (or a random sample of it) is the 
most satisfactory method. It is usually not practical to wait many years until families
of affected are complete, and when families are not complete the Greenwood-Yule
method gives results which are quite unreliable. When conclusions are based on a 
comparison with a control population fertility should always be examined, and 
usually the data permit this. When the Greenwood-Yule method is used it should 
be remembered that if fertility is altered after birth of an affected (for example by a 
decision to limit subsequent reproduction) the association with age or birth order 
would be affected in the same way as by incomplete knowledge of sibships (viz.,
there will be an apparent increase in incidence at late ages or birth ranks). Results of
both methods may be influenced by prenatal mortality and, when affected are
assembled some years after birth (as in the correct use of the Greenwood-Yule
method), by post-natal mortality also.


EXAMPLES OF THE USE OF MATERNAL AGE AND BIRTH ORDER 


Although there have been many observations on association between incidence of 
disease and maternal age and birth order, few attempts have been made to explore its 
aetiological significance. It is the special value of these observations that they not 
only point to environmental influence, but when considered in relation to other 
features of an abnormality, they suggest the direction of further enquiry. But the 
significance of age and birth order is not uniform, and only by taking account of 
clinical and pathological evidence can we make any useful deduction about aetiology. 
These points will be clearer when illustrated by examples of environmental influence 
suggested by consideration of age and birth order before, during and after birth. 


(a) The Pre-natal Environment 


It is generally accepted that the incidence of mongolism and anencephalus is 
associated with maternal age and birth rank respectively. The nature of these 
abnormalities indicates that they are established during the early weeks of gestation, 
and the evidence on maternal age and birth order directs attention to the uterine 
environment at this period. It is known that abortion is very common, particularly 
during the first third of pregnancy, and that a considerable proportion of abortions— 
half is a usual estimate—are self-induced. We know very little about the pathology 
of spontaneous abortions, or about the possibility that unsuccessful attempts to 
induce abortion may disturb the foetus or its blood supply. Further investigation of 
these matters may advance understanding of the aetiology of some malformations. 


(b) The Intra-natal Environment


The ductus arteriosus normally closes shortly after birth, and there is experimental 
evidence that closure may depend upon adequate oxygenation of the blood (Kennedy, 
1942). Since difficulties of labour are more common in first than in later pregnancies, 
the observation that the incidence of patent ductus arteriosus is raised in first born 
(MacMahon, 1952) suggested the possibility that in some cases patency may be due 
to foetal asphyxia during and following delivery. The incidence of foetal distress was 
indeed substantially higher among affected than would be expected (Record & 
McKeown, 1953), and although it is not clear why transient respiratory embarrassment
should result in persistent patency, the observation is suggestive, and it is
evidently worth while to explore further the significance of asphyxia in the aetiology
of this condition. 


(c) The Post-natal Environment 


Unlike the uterine environment, the post-natal environment favours children in 
low birth ranks and at late maternal ages. This is because in so far as it affects the 
health of children the character of the environment after birth is largely determined 
by (a) the presence of older sibs, who may convey infectious disease, and (b) the 
economic resources of parents, which are greater in small than in large families, and 
at late than at early maternal ages. Again an example will show that a generalisation 
of this kind does not help us greatly to interpret the aetiology of a specific condition, 
unless we have regard for all the available information about it. 

Since the early years of this century it has been known that incidence of infantile 
hypertrophic pyloric stenosis was raised in first born. But since it was undecided 
whether the condition was present at birth, it was not known whether the environ- 
mental influence suggested by this observation was exerted before, during or after 
delivery. There are now reasons for believing that the tumour develops after birth: 
there was no evidence of abnormality at birth in the radiographs of 5 male infants 
who later developed pyloric stenosis (Wallgren, 1946), and size of tumour is highly 
correlated with age at operation (McKeown, MacMahon & Record, 1951). This does 
not exclude the possibility that the environmental influence is pre-natal, but this 
seems unlikely since (a) it is not until about the third week after birth that incidence 
in first born is significantly raised, and (b) symptoms appear earlier in domiciliary 
than in hospital births (McKeown, MacMahon & Record, 1952). The environment 
of the child during the first few weeks of life is relatively uncomplicated, and by 
taking into account the known features of infantile pyloric stenosis it may be possible 
to specify more precisely the nature of the adverse influence. 

These examples of environmental influence before, during and after birth illustrate 
three points about the use of maternal age and birth order. First, the examination is of 
very limited interest unless the significance of an observed association is explored 
further. Second, the same association may have very different explanations in 
different circumstances. And third, taken with other clinical and pathological 
evidence, information about age and birth order may suggest profitable lines of 
enquiry. It is this last point which justifies the hope that, when fully exploited, 
examination of maternal age and birth order may contribute materially to the un- 
ravelling of the nature-nurture complex presented by most common diseases of man. 


SUMMARY 


It is suggested that the two methods used to examine the association between 
incidence of an abnormality and maternal age and birth order are complementary. 
One of the methods relies upon a comparison between affected and a control popula- 
tion, and is appropriate in conditions manifested at or soon after birth, when related 
births can be identified. The other method (Greenwood-Yule) can be used in the 
case of conditions not manifested until a considerable period after birth when sibships
are complete. (The error introduced when the Greenwood-Yule method is
applied, as it frequently is, to incomplete sibships is examined.) The extent to
which results of the two methods are affected by variation in fertility, and by differ- 
ential pre-natal and post-natal mortality, is discussed. 

Examples of the use of maternal age and birth order as indices of environmental 
influences are referred to. It is suggested that (a) these indices are of little interest 
unless their significance is explored further, (b) the same association may have differ- 
ent explanations in different circumstances, and (c) when taken with other clinical 
and pathological evidence, information about age and birth order may suggest 
profitable lines of enquiry. 
APPENDIX I
Illustration of the hypothetical population used in Model A

It is supposed that women reproduce at a uniform rate and that completed fraternities contain
2, 3, 4 or 5 children, numbers of fraternities of each size being equal.
The pattern of reproduction may then be represented on a horizontal time scale, each fraternity
occupying one line and each numeral representing the birth rank of the child, thus:

[Chart: fraternities of 2, 3, 4 and 5 children set out on a horizontal time scale, one fraternity per line, with two vertical lines marking the limits of the period of observation.]

If the incidence of the condition under investigation is:
  3 per 1000 among children in the first birth rank
  2 per 1000 among children in the second birth rank
  1 per 1000 among children in each of the remaining three birth ranks,

fraternities identified by the presence of an affected born within the period limited by the two
vertical lines would be as follows. (To simplify the model the possibility of a sibship containing
two affected members has been ignored.) Affected individuals are shown in bold type.

[Chart: fraternities identified by the birth of an affected child within the period.]

Thus 78 propositi are identified in a population of 42,000 births. The ratios in
Table 7 are based on a population 10 times greater.
APPENDIX II
Calculation of Mean Family Size of Sibships containing a Case of Congenital Heart Disease

Consideration is restricted to families resulting from marriages in the period 1925-29. The incidence
of congenital heart disease is taken as 3 per 1000 births (i.e. p = 0.003).

No. of live born                                          Expected no. of families identified by
children by 1946    No. of marriages    1 - (1 - p)^s     presence of one or more affected*
      (s)                 (a)                                          (e)
       0                20,003             0.00000                     Nil
       1                31,583             0.00300                  94.749
       2                31,513             0.00599                 188.763
       3                17,535             0.00897                 157.289
       4                 8,898             0.01195                 106.331
       5                 4,836             0.01491                  72.105
       6                 2,650             0.01786                  47.329
       7                 1,359             0.02081                  28.281
       8                   675             0.02375                  16.031
       9                   301             0.02668                   8.031
      10                    98             0.02960                   2.901
      11                    43             0.03251                   1.398
      12                    15             0.03541                   0.531
      13                     7             0.03830                   0.268
      14                     3             0.04119                   0.124

Mean sibship size of fraternities containing one or more affected =

$$\frac{\Sigma(es)}{\Sigma(e)} = \frac{2468.439}{724.131} = 3.41$$

Numbers of marriages are derived from Papers of the Royal Commission on Population, 1954,
H.M.S.O., Vol. 6, Part 2, Table C50.

* It is assumed that the condition is not familial and is not associated with birth rank.
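
The calculation in Appendix II can be repeated directly from the column of marriages. The sketch below is illustrative; the function names are hypothetical, and unrounded probabilities are used, so the intermediate totals differ slightly from the printed ones although the mean agrees to two decimal places.

```python
P = 0.003  # assumed incidence of congenital heart disease per birth (from the text)

# Number of marriages (a) by number of live born children (s), from the table.
MARRIAGES = {
    0: 20003, 1: 31583, 2: 31513, 3: 17535, 4: 8898, 5: 4836, 6: 2650,
    7: 1359, 8: 675, 9: 301, 10: 98, 11: 43, 12: 15, 13: 7, 14: 3,
}


def mean_size_of_identified_families(marriages, p):
    """Mean size of families containing one or more affected children.

    A family of size s is identified with probability 1 - (1 - p)**s,
    assuming the condition is not familial and not related to birth rank.
    """
    expected = {s: a * (1 - (1 - p) ** s) for s, a in marriages.items()}
    return sum(s * e for s, e in expected.items()) / sum(expected.values())


def mean_size_of_fertile_families(marriages):
    """Mean size of all families with at least one live born child."""
    fertile = {s: a for s, a in marriages.items() if s > 0}
    return sum(s * a for s, a in fertile.items()) / sum(fertile.values())


print(round(mean_size_of_identified_families(MARRIAGES, P), 2))
print(round(mean_size_of_fertile_families(MARRIAGES), 2))
```

The second figure is smaller than the first, which is the selection effect referred to in the text: families ascertained through an affected member tend to be larger than average.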
Variances of Gene Frequency Estimates


WILLIAM C. BOYD 


Boston University School of Medicine 


WHEN GENE FREQUENCIES are estimated by maximum likelihood methods (Stevens 
1938, Fisher 1946, 1947, Mather 1951, Boyd 1954a, 1954b) the variances of the esti- 
mates as a rule emerge as simple by-products of the approximation process involved, 
or can be obtained in a relatively simple manner from such calculations. However, 
it is often necessary to start with preliminary estimates made by other methods, 
and the question sometimes arises, are these preliminary estimates themselves sufficiently
precise for a certain purpose? The efficiency of an estimate is defined as the ratio
of the variance of the maximum likelihood estimate of the parameter to its own variance (Stevens
1938, Fisher 1950, Mather 1951). We can therefore answer the above question if we 
can obtain the variances of the crude estimates. 

The large sample variances of certain gene frequency estimates obtained in 
various simple ways have been published from time to time, but each such set seems 
to have been derived as a special case, and not always in the simplest manner. In 
particular, several authors (Wiener 1931, 1935, 1952; Steinberg et al. 1954) have
arrived at the covariance of two estimates needed in an intermediate step in the 
calculations, by obtaining the correlation coefficient and multiplying by the product 
of the two standard deviations (as is recommended in certain sources, for example 
Arley and Buch 1950), rather than proceeding to find the covariance directly. This 
roundabout procedure is probably still being used because of the traditional authority 
of the correlation technique, for Fisher says of the correlation coefficient that no 
quantity has been more characteristic of biometrical work. Feller (1950), however, 
calls the correlation coefficient ‘‘a fancy way of writing” the covariance, which he 
treats as the primary concept. Methods of treating these problems do not seem to 
be included in all of the books on statistical methods which are being published today. 
A brief survey of the subject is offered by Cotterman (1954); and some of the early 
results are summarized (often without derivations) by Li (1955). Consequently it 
was felt that a systematic treatment of the subject, although embodying no new 
principle, might be of service to those who have occasion to estimate gene frequencies, 
in connection with blood grouping or otherwise. It is intended to give an exposition 
of the basic methods which will enable non-mathematicians to derive new formulas 
as needed. There does not seem to be any single source where such an explanation 
can be found; the necessary bits of information are scattered widely in various books 
and papers. The present paper derives mainly from the work of Bernstein (1930), 
who was the pioneer in applying mathematical methods to blood grouping problems; 
consequently Bernstein’s notation has been followed. 


Received August 13, 1955. 
BASIC PRINCIPLES


It might be well to start from first principles. If we have a set of observations of a
variable A’, and let $A'_0$ represent the mean value (or “true” value) of A’, we may
write $A' = A'_0 + x_1$. For a similar set of observations of B’ we may write
$B' = B'_0 + x_2$. Then from the definitions of variance and covariance, which will be found
in any text book

$$V(A') = E(A' - A'_0)^2 = E(x_1^2)$$

where V stands for the variance and E signifies the mean. If A’ and B’ are not independent,
we have

$$CV(A', B') = E(x_1 x_2)$$

where CV signifies the covariance, and the expression $x_1 x_2$ signifies the product of
$x_1$ and $x_2$.

Let us suppose our values A’, B’, etc. are the numbers observed in the various
classes of an exhaustive and mutually exclusive classification. Call the total number
of observations G. These classes now constitute a Bernoulli series (Arley and Buch
1950, Cramér 1951, Rietz 1927), and we may make use of theorems which have been
proved for such systems. Suppose we divide all the observed numbers by G and obtain
relative frequencies A, B, C, etc. Then

$$A + B + C + \cdots = 1.$$

The variance of any one of these frequencies is known to be

$$V(A) = \frac{A_0(1 - A_0)}{G}$$

where $A_0$ represents the mean or “true” value of A. The covariance can be obtained
as follows (Hotelling 1936):

We know that $V(A) = A_0(1 - A_0)/G$, $V(B) = B_0(1 - B_0)/G$ and, since A + B is
a Bernoulli variate with probability $A_0 + B_0$,

$$V(A + B) = (A_0 + B_0)[1 - (A_0 + B_0)]/G.$$

But

$$V(A + B) = V(A) + V(B) + 2\,CV(A, B)$$

(from formula 2, below), or

$$\begin{aligned}
CV(A, B) &= (1/2)[V(A + B) - V(A) - V(B)] \\
&= (1/2G)[-A_0(1 - A_0) - B_0(1 - B_0) + (A_0 + B_0)(1 - A_0 - B_0)] \\
&= -A_0 B_0/G.
\end{aligned}$$

Similarly,

$$CV(A, C) = -A_0 C_0/G, \text{ etc.}$$
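
The algebra of this step can be checked with exact rational arithmetic. The sketch below is illustrative only; the function names are hypothetical and the numerical values are arbitrary, not taken from any data in the paper.

```python
from fractions import Fraction


def variance(p0, G):
    """Variance of an observed relative frequency whose true value is p0."""
    return p0 * (1 - p0) / G


def covariance(a0, b0, G):
    """Covariance of two class frequencies, obtained from
    V(A + B) = V(A) + V(B) + 2 CV(A, B)."""
    return (variance(a0 + b0, G) - variance(a0, G) - variance(b0, G)) / 2


# Arbitrary illustrative values.
A0, B0, G = Fraction(3, 10), Fraction(1, 5), 100
print(covariance(A0, B0, G))   # -3/5000
print(-A0 * B0 / G)            # -3/5000
```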
If we have any function $f = f(x_1, x_2, \ldots)$ of the deviations $x_1, x_2, \ldots$ and $f_0$
stands for the mean of f, we may write, by Taylor’s theorem, neglecting second and
higher powers of the (assumed small) deviations

$$f - f_0 = \left(\frac{\partial f}{\partial x_1}\right) x_1 + \left(\frac{\partial f}{\partial x_2}\right) x_2 + \cdots$$

or, putting

$$f_1 = \left(\frac{\partial f}{\partial x_1}\right), \text{ etc.,}$$

$$f - f_0 = f_1 x_1 + f_2 x_2 + \cdots$$

Therefore

$$V(f - f_0) = E(f - f_0)^2 = E[f_1^2 x_1^2 + f_2^2 x_2^2 + \cdots + 2 f_1 f_2 x_1 x_2 + \cdots]$$

This procedure, apparently first applied to blood grouping problems by Bernstein
(1930), is essentially what more recent writers call the “delta method”.


PREVIOUS RESULTS 


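From this formula the variances and covariances of gene frequency estimates
may be found. One of the commonest ways of estimating a gene frequency is as the
square root of the frequency of a homozygous class. This is the method which is
perforce applied to the estimation of the frequency of recessive genes in human populations.
It has also been used to estimate the frequencies of the genes in the M, N
blood group system, and we may use this as an illustration. We have $m' = \sqrt{M}$,
$n' = \sqrt{N}$. Now M + MN + N = 1. Consequently

$$V(M) = \frac{M(1 - M)}{G}, \qquad V(N) = \frac{N(1 - N)}{G}, \qquad CV(M, N) = -\frac{M \times N}{G}.$$

We write

$$m' = \sqrt{M_0 + x_1} \quad \text{and} \quad n' = \sqrt{N_0 + x_2};$$

therefore, at the point where $x_1$, $x_2$, etc. = zero, we have

$$\frac{\partial f}{\partial x_1} = \frac{1}{2\sqrt{M_0}}, \qquad \frac{\partial f}{\partial x_2} = \frac{1}{2\sqrt{N_0}},$$

where f is defined as $f = m' + n' = \sqrt{M_0 + x_1} + \sqrt{N_0 + x_2}$. Then

$$\begin{aligned}
V(f) &= E(f - f_0)^2 \\
&= \frac{1}{4M_0}\,\frac{M_0(1 - M_0)}{G} + \frac{1}{4N_0}\,\frac{N_0(1 - N_0)}{G} - \frac{2}{4\sqrt{M_0 \times N_0}}\,\frac{M_0 \times N_0}{G} \\
&= \frac{(1 - M_0)}{4G} + \frac{(1 - N_0)}{4G} - \frac{2\sqrt{M_0 \times N_0}}{4G}.
\end{aligned}$$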
Letting m and n represent the true values of these gene frequencies, m + n = 1, we
have at gene equilibrium

$$M_0 = m^2, \qquad N_0 = n^2.$$

Therefore

$$V(f) = \frac{(1 - m^2) + (1 - n^2) - 2mn}{4G} = \frac{2 - (m + n)^2}{4G} = \frac{1}{4G}.$$

Since V(x + k) = V(x), where k is a constant (Arley and Buch 1950), this gives
$V[1 - (m' + n')] = 1/4G$, a formula derived by Wiener (1931) in a somewhat
different way. We note that $V(m) = (1 - m^2)/4G$ and $V(n) = (1 - n^2)/4G$, formulas
which were also derived by Wiener.
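
The first-order procedure and the square root example can be sketched numerically. The code below is a minimal illustration, not Bernstein's or Wiener's own computation: it builds the variances and covariances of the class frequencies from the Bernoulli-series formulas and evaluates the quadratic form in the first derivatives. The function names and the values of m and G are hypothetical.

```python
from math import sqrt


def multinomial_cov(probs, G):
    """Variances p(1 - p)/G on the diagonal, covariances -p_i p_j / G elsewhere."""
    k = len(probs)
    return [[(probs[i] * (1 - probs[i]) if i == j else -probs[i] * probs[j]) / G
             for j in range(k)] for i in range(k)]


def delta_variance(grad, probs, G):
    """First-order variance of f: sum over i, j of f_i f_j CV(x_i, x_j)."""
    cov = multinomial_cov(probs, G)
    k = len(probs)
    return sum(grad[i] * grad[j] * cov[i][j] for i in range(k) for j in range(k))


# Arbitrary illustrative values.
m, G = 0.6, 200
n = 1 - m
probs = [m * m, 2 * m * n, n * n]           # true frequencies of M, MN, N

# f = sqrt(M) + sqrt(N): derivatives 1/(2 sqrt(M0)), 0, 1/(2 sqrt(N0)).
grad_sum = [1 / (2 * sqrt(probs[0])), 0.0, 1 / (2 * sqrt(probs[2]))]
v_sum = delta_variance(grad_sum, probs, G)
print(v_sum, 1 / (4 * G))                   # the two values agree

# f = sqrt(M) alone.
v_m = delta_variance([1 / (2 * sqrt(probs[0])), 0.0, 0.0], probs, G)
print(v_m, (1 - m * m) / (4 * G))           # the two values agree
```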

The application of these methods is of course not limited to square root formulas.
For instance, we may apply them to the “gene counting” estimates of the gene frequencies
of M and N, where we write

$$m' = M + MN/2, \qquad n' = N + MN/2.$$

If we let $M = M_0 + x_1$, $N = N_0 + x_2$, and $MN = (MN)_0 + x_3$ and

$$f(x_1, x_3) = M_0 + x_1 + \frac{(MN)_0 + x_3}{2}$$

we have $f_1 = 1$, $f_3 = 1/2$. Then

$$\begin{aligned}
V(M + MN/2) &= f_1^2\,V(M_0) + f_3^2\,V[(MN)_0] + 2 f_1 f_3\,CV[M_0, (MN)_0] \\
&= M_0(1 - M_0)/G + (MN)_0[1 - (MN)_0]/4G - M_0 (MN)_0/G.
\end{aligned}$$

If m and n represent the true values of these gene frequencies, $M_0 = m^2$, $N_0 = n^2$,
and $(MN)_0 = 2mn$, then

$$V(m) = V(n) = mn/2G$$

a formula derived by Wiener (1935) in a different way and given also by Stevens
(1938).
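
For the gene counting estimate the result can also be confirmed without any approximation, by enumerating every possible sample of a small size. This is a check by a different route from the one used in the text, offered only as an illustration; the function name and the values of m and G are hypothetical, and the classes are assumed to be in gene equilibrium.

```python
from math import factorial


def exact_variance_gene_count(m, G):
    """Exact variance of m' = (M + MN/2)/G over all samples of G individuals.

    M, MN and N here are class numbers (counts), with true class
    probabilities m^2, 2mn and n^2.
    """
    n = 1 - m
    probs = (m * m, 2 * m * n, n * n)
    mean = 0.0
    second_moment = 0.0
    for M in range(G + 1):
        for MN in range(G - M + 1):
            N = G - M - MN
            coef = factorial(G) // (factorial(M) * factorial(MN) * factorial(N))
            prob = coef * probs[0] ** M * probs[1] ** MN * probs[2] ** N
            estimate = (M + MN / 2) / G
            mean += prob * estimate
            second_moment += prob * estimate * estimate
    return second_moment - mean * mean


m, G = 0.3, 5
print(exact_variance_gene_count(m, G), m * (1 - m) / (2 * G))   # the two values agree
```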

The gene counting estimates are also the maximum likelihood estimates, and 
consequently the variances can also be derived by the method of maximum likelihood, 
which in this case is simpler. This will serve as a general example of the derivation of
variances by this method. The logarithmic likelihood function is

$$L = 2M \log m + MN \log m + MN \log n + 2N \log n + \text{const.},$$

where M, N, etc. are the numbers observed to fall into the respective classes, and
m and n are the gene frequencies we desire to estimate. Then

$$\frac{d^2 L}{dm^2} = \frac{-2Mn^2 - MNn^2 - MNm^2 - 2Nm^2}{m^2 n^2}.$$

Putting $M = Gm^2$, $MN = 2Gmn$, $N = Gn^2$, we have

$$V(m) = \frac{1}{-d^2L/dm^2} = mn/2G.$$
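The maximum likelihood step can be reproduced numerically as a check. This is a sketch under stated assumptions: the function names and the finite-difference step are illustrative, n is replaced by 1 - m, and the observed numbers are set to their expected values as in the text.

```python
from math import log

def log_likelihood(m, M, MN, N):
    """Logarithmic likelihood (constant omitted) with n = 1 - m."""
    n = 1.0 - m
    return (2 * M + MN) * log(m) + (MN + 2 * N) * log(n)

def ml_variance(m, G, step=1e-4):
    """V(m) = 1 / (-d2L/dm2), using a central difference for the second derivative."""
    n = 1.0 - m
    # Expected numbers in the three phenotype classes
    M, MN, N = G * m * m, 2 * G * m * n, G * n * n
    second = (log_likelihood(m + step, M, MN, N)
              - 2 * log_likelihood(m, M, MN, N)
              + log_likelihood(m - step, M, MN, N)) / step ** 2
    return -1.0 / second

m, G = 0.6, 200                              # hypothetical values
print(ml_variance(m, G), m * (1 - m) / (2 * G))   # both close to mn/2G
```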
The first procedure outlined above was used by Bernstein (1930) to derive the
variance of the expression $D = 1 - (p' + q' + r')$, where p’, q’ and r’ represent
the frequencies of the blood group genes A, B and O, estimated by Bernstein’s
formulas

$$p' = 1 - \sqrt{O + B}, \qquad q' = 1 - \sqrt{O + A}, \qquad r' = \sqrt{O}.$$

Since $O = r^2$, $A = p^2 + 2pr$, $B = q^2 + 2rq$, Bernstein wrote

$$A + O = (p + r)^2 + x_1$$
$$B + O = (q + r)^2 + x_2$$
$$O = r^2 + x_3$$

where p, q and r are the “true” values of p’, q’ and r’. He showed that
$p' = 1 - \sqrt{(q + r)^2 + x_2}$, $q' = 1 - \sqrt{(p + r)^2 + x_1}$, $r' = \sqrt{r^2 + x_3}$. He put
$f(x_1, x_2, x_3) = D = 1 - (p' + q' + r')$. Therefore $f_1 = \frac{1}{2(p + r)}$, $f_2 = \frac{1}{2(q + r)}$,
$f_3 = -\frac{1}{2r}$, $x_1 = (A - A_0) + (O - O_0)$, $x_2 = (B - B_0) + (O - O_0)$, so that

$$E(x_1 x_2) = E[(A - A_0)(B - B_0) + (B - B_0)(O - O_0) + (A - A_0)(O - O_0) + (O - O_0)(O - O_0)]$$
$$= CV(A_0, B_0) + CV(B_0, O_0) + CV(A_0, O_0) + V(O_0)$$
$$= -A_0 B_0/G - B_0 O_0/G - A_0 O_0/G + O_0(1 - O_0)/G$$
$$= r^2/G - (p + r)^2 (q + r)^2/G$$

and similarly

$$E(x_1 x_3) = r^2/G - r^2 (p + r)^2/G$$
$$E(x_2 x_3) = r^2/G - r^2 (q + r)^2/G$$

Therefore he found

$$V(f - f_0) = E(f_1^2 x_1^2 + f_2^2 x_2^2 + f_3^2 x_3^2 + 2 f_1 f_2 x_1 x_2 + \cdots) = \frac{pq}{2G(1 - p)(1 - q)}$$

which is the well known formula for the variance of D. For the individual estimates
he found the variances

$$V(p') = E(f_2^2 x_2^2) = [1 - (q + r)^2]/4G$$
$$V(q') = E(f_1^2 x_1^2) = [1 - (p + r)^2]/4G$$
$$V(r') = E(f_3^2 x_3^2) = (1 - r^2)/4G$$
Instead of the crude estimates p’, q’ and r’, it is now customary to use Bernstein’s 
adjusted estimates $p'' = p'(1 + D/2)$, $q'' = q'(1 + D/2)$, $r'' = (r' + D/2)(1 + D/2)$,
which are very close to maximum likelihood estimates. Formulas for the variances 
of these adjusted estimates do not appear to have been published. However Neel 
and Schull (1954) point out that these estimates, although not precisely maximum 
likelihood estimates, are fully efficient, and consequently we may use the variances 
of the maximum likelihood estimates for them. Neel and Schull (see also Cotterman 
1954) give formulas for the elements of the information matrix in this case, and by 
inverting this matrix the variances and covariances are obtained. De Groot (1956) 
has recently derived explicit expressions for the variances and covariances in terms 
of p, q and r. 

In applying the method used by Bernstein we often have to calculate the variance 
of certain functions of frequencies with known variances and covariances. We need 
the following formulas, which are well known, but nevertheless not to be found in all 
the books on the subject. 


$$V(ax) = a^2 V(x), \qquad [1]$$

where a is a constant,

$$V(x + y) = V(x) + V(y) + 2CV(x, y), \qquad [2]$$

$$V(xy) = x^2 V(y) + y^2 V(x) + 2xy\, CV(x, y), \qquad [3]$$

approximately, where x and y are two variables which are not independent, and
CV(x, y) is their covariance.

For instance, let us consider the recent calculations by Steinberg, Jones, Allen 
and Diamond (1954) of the frequency of the combination on one chromosome of 
the Rh factors C and f. On the assumption that all the anti-c and anti-e sera used 
contain anti-f also, they point out that the various genotypes CF/CF, CF/Cf, etc. 
would all be typed as either CC, Cc, or cc. Then they show that the frequency v of 
the combination Cf would be 


$$v = 1 - \sqrt{(cc)} - \sqrt{(CC)},$$

and to find if the observed value is significant the variance of v is needed. They represent
the value of $\sqrt{(cc)}$ as c and of $\sqrt{(CC)}$ as u. For the variance of v they find


Vw) = Vo + Vay — 28cSutcu 
where 


Ve. = Vo = = 


Su = VV); tou = 




From equation [2] we can write at once

$$V(v) = (1 - c^2)/4G + (1 - u^2)/4G - 2cu/4G = \frac{1}{4G}\left[2 - (c + u)^2\right],$$

a simpler formula which gives the same numerical results as that used by Steinberg
et al., if both are accurately calculated.
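The step from equation [2] to the simpler formula rests on the covariance of the two square roots. A short worked version is sketched here, assuming (as in the ABO case above) that the observed frequencies (cc) and (CC) have covariance $-(cc)(CC)/G$ and that each square root is expanded to first order:

```latex
\begin{aligned}
CV(c, u) &\approx \frac{1}{2c}\cdot\frac{1}{2u}\cdot CV\bigl[(cc), (CC)\bigr]
          = \frac{1}{4cu}\left(-\frac{c^{2}u^{2}}{G}\right) = -\frac{cu}{4G},\\[4pt]
V(v) &= V(c + u) = V(c) + V(u) + 2\,CV(c, u)\\
     &= \frac{1 - c^{2}}{4G} + \frac{1 - u^{2}}{4G} - \frac{2cu}{4G}
      = \frac{2 - (c + u)^{2}}{4G}.
\end{aligned}
```

Here $V(c)$ and $V(u)$ are the usual square root variances, of the same form as $V(r')$ for the ABO system.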


NEW RESULTS 


We may apply these methods to obtain the variance of gene frequency estimates 
for which no variances have yet been reported. Consider for instance Wiener’s 
formulas for estimating p, q and r from ABO data 


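$$p' = \sqrt{O + A} - \sqrt{O}, \qquad q' = \sqrt{O + B} - \sqrt{O}, \qquad r' = \sqrt{O}.$$

If we write $r' = \sqrt{r^2 + x_3}$, $p' = \sqrt{(p + r)^2 + x_1} - \sqrt{r^2 + x_3}$,
$q' = \sqrt{(q + r)^2 + x_2} - \sqrt{r^2 + x_3}$, where p, q and r are the “true” values, we have for p’

$$f = f(x_1, x_3), \qquad f_1 = \frac{1}{2(p + r)}, \qquad f_3 = \frac{-1}{2r},$$

$$E(x_1 x_3) = E(O - O_0)(O - O_0 + A - A_0) = V_O + CV_{A,O}$$
$$= \frac{r^2(1 - r^2)}{G} - \frac{r^2(p^2 + 2pr)}{G} = \frac{r^2[1 - (p + r)^2]}{G},$$

$$V(p') = \frac{1 - (p + r)^2}{4G} + \frac{1 - r^2}{4G} - \frac{2r[1 - (p + r)^2]}{4(p + r)G}$$
$$= \frac{1}{4G}\left[1 - r^2 + 1 - (p + r)^2 + 2pr + 2r^2 - \frac{2r}{p + r}\right] = \frac{1}{4G}\left[2 - p^2 - \frac{2r}{p + r}\right].$$

Similarly

$$V(q') = \frac{1}{4G}\left[2 - q^2 - \frac{2r}{q + r}\right],$$

and of course

$$V(r') = \frac{1 - r^2}{4G}$$

as before.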

It will be seen from these formulas that the variances of the Wiener estimates are 
generally larger than those of the unadjusted Bernstein estimates. (They are of course 
also larger than those of the adjusted estimates.) For example, for a typical Asiatic 
population the variance of the Wiener estimate of p would be 0.531/4G and that of 
the Bernstein estimate about 0.510/4G. For a typical European population the 
variances of p would be 0.577/4G and 0.510/4G. For certain blood group distribu- 
tions, which are not however likely to be met with in actual populations, the variance 
of the Wiener estimate may be at least five times that of the Bernstein estimate. This 
difference in the variances of the two estimates does not seem to have been noticed 
by earlier workers. 

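In exactly the same way we may obtain the variances of the estimates of the gene
frequencies for the $A_1A_2BO$ system:

$$r' = \sqrt{O}$$
$$q' = \sqrt{O + B} - \sqrt{O}$$
$$p_2' = \sqrt{O + A_2} - \sqrt{O}$$
$$p_1' = \sqrt{O + A_1 + A_2} - \sqrt{O + A_2}$$

We obtain

$$V(p_2') = \frac{1}{4G}\left[2 - p_2^2 - \frac{2r}{p_2 + r}\right]$$

$$V(p_1') = \frac{1}{4G}\left[2 - p_1^2 - \frac{2(p_2 + r)}{p_1 + p_2 + r}\right]$$

while V(q’) and V(r’) remain the same.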
Wiener (1954) has proposed to estimate the gene frequencies for the MNS system 
by the following formulas 


$$m_s = \tfrac{1}{2}\left(\sqrt{M + N + MN} + \sqrt{M} - \sqrt{N}\right)$$
$$n_s = \tfrac{1}{2}\left(\sqrt{M + N + MN} - \sqrt{M} + \sqrt{N}\right)$$

where M, N and MN are the classes negative for antigen S.

Since at genetic equilibrium we have the following relations (Boyd 1954a):

$$M = m_s^2$$
$$MS = m_S^2 + 2 m_s m_S$$
$$MN = 2 m_s n_s$$
$$MNS = 2 m_s n_S + 2 m_S n_s + 2 m_S n_S$$
$$N = n_s^2$$
$$NS = n_S^2 + 2 n_s n_S$$

it is clear that if we consider the tests with the anti-S serum alone the frequency of
s (the S-negative gene) $= \sqrt{ss} = \sqrt{M + N + MN}$. The proposed formulas are
simply a way of adjusting the estimated frequencies of $m_s$ and $n_s$ to make them add
up to s.

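Let

$$M = M_0 + x_1, \qquad N = N_0 + x_2, \qquad M + N + MN = W = W_0 + x_3.$$

Then

$$E(x_1^2) = M_0(1 - M_0)/G = m_s^2(1 - m_s^2)/G,$$
$$E(x_2^2) = n_s^2(1 - n_s^2)/G,$$
$$\text{and } E(x_3^2) = s^2(1 - s^2)/G,$$

$$E(x_1 x_2) = CV(M_0, N_0) = -m_s^2 n_s^2/G,$$

$$E(x_1 x_3) = CV(M_0, W_0) = E(M - M_0)[(M - M_0) + (N - N_0) + (MN - MN_0)]$$
$$= V(M_0) + CV(M_0, N_0) + CV(M_0, MN_0) = m_s^2(1 - s^2)/G,$$

and

$$E(x_2 x_3) = n_s^2(1 - s^2)/G.$$

Let

$$f(x_1, x_2, x_3) = \sqrt{m_s^2 + x_1} - \sqrt{n_s^2 + x_2} + \sqrt{s^2 + x_3},$$
$$f_1 = \frac{1}{2m_s}, \qquad f_2 = -\frac{1}{2n_s}, \qquad f_3 = \frac{1}{2s}.$$

Then

$$E(f - f_0)^2 = V(f) = E[f_1^2 x_1^2 + f_2^2 x_2^2 + f_3^2 x_3^2 + 2f_1 f_2 x_1 x_2 + 2f_1 f_3 x_1 x_3 + 2f_2 f_3 x_2 x_3]$$

$$V(f) = \frac{m_s^2(1 - m_s^2)}{4m_s^2 G} + \frac{n_s^2(1 - n_s^2)}{4n_s^2 G} + \frac{s^2(1 - s^2)}{4s^2 G}$$
$$- \frac{2}{4m_s n_s}\left(-\frac{m_s^2 n_s^2}{G}\right) + \frac{2}{4m_s s}\cdot\frac{m_s^2(1 - s^2)}{G} - \frac{2}{4n_s s}\cdot\frac{n_s^2(1 - s^2)}{G}$$
$$= \frac{1}{4G}\left[1 - 4m_s^2 + 4m_s/s\right]$$

Therefore

$$V(m_s) = V\left[\tfrac{1}{2}\left(\sqrt{M} - \sqrt{N} + \sqrt{W}\right)\right] = \frac{1}{16G}\left[1 - 4m_s^2 + 4m_s/s\right]$$

The variance of the estimate of $n_s$ can be derived in the same way, and the
variances of $m_S$ and $n_S$ with a little more difficulty.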
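As a check on the algebra, the six terms of V(f) can be summed directly and compared with the closed expression. The sketch below is illustrative only: the function names and the trial values $m_s = 0.3$, $n_s = 0.4$, G = 500 are assumptions, and s is taken as $m_s + n_s$.

```python
def wiener_ms_variance(m_s, n_s, G):
    """V(m_s) for the MNS square root estimate, summing the six terms of V(f)."""
    s = m_s + n_s
    f1, f2, f3 = 1 / (2 * m_s), -1 / (2 * n_s), 1 / (2 * s)
    e11 = m_s ** 2 * (1 - m_s ** 2) / G      # E(x1^2)
    e22 = n_s ** 2 * (1 - n_s ** 2) / G      # E(x2^2)
    e33 = s ** 2 * (1 - s ** 2) / G          # E(x3^2)
    e12 = -m_s ** 2 * n_s ** 2 / G           # E(x1 x2)
    e13 = m_s ** 2 * (1 - s ** 2) / G        # E(x1 x3)
    e23 = n_s ** 2 * (1 - s ** 2) / G        # E(x2 x3)
    var_f = (f1 ** 2 * e11 + f2 ** 2 * e22 + f3 ** 2 * e33
             + 2 * f1 * f2 * e12 + 2 * f1 * f3 * e13 + 2 * f2 * f3 * e23)
    return var_f / 4                         # the estimate is f/2, so formula [1] gives V(f)/4

def closed_form(m_s, n_s, G):
    """(1/16G)[1 - 4 m_s^2 + 4 m_s / s]."""
    s = m_s + n_s
    return (1 - 4 * m_s ** 2 + 4 * m_s / s) / (16 * G)

print(wiener_ms_variance(0.3, 0.4, 500), closed_form(0.3, 0.4, 500))
```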

Finally, we may mention that by proceeding quite similarly to Bernstein in the 
above example, we can find the variances of the square root estimates of the Rh gene 
frequencies made by the method of Race, Mourant, and McFarlane (1946). The 
formulas are 


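$$r = \sqrt{(-+--)}$$
$$R' = \sqrt{(-+--) + (++--) + (+---)} - \sqrt{(-+--)}$$
$$R'' = \sqrt{(-+--) + (-+-+)} - \sqrt{(-+--)}$$
$$R_0 = \sqrt{(-+--) + (-++-)} - \sqrt{(-+--)}$$
$$R_1 = \sqrt{(+-+-) + (+---)} - R'$$
$$R_z = \sqrt{(+-+-) + (+---) + (+-++)} - \sqrt{(+-+-) + (+---)}$$
$$R_2 = 1 - (r + R' + R'' + R_0 + R_1 + R_z)$$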


where the plus and minus signs signify positive and negative reactions with antisera 
anti-C, anti-c, anti-D, and anti-E in that order. The derivation of these equations 
will be obvious from the genetic formulas 


$$(++--) = 2R'r$$
$$(-+-+) = R''^2 + 2R''r$$
$$(-++-) = R_0^2 + 2R_0 r$$
$$(+-+-) = R_1^2 + 2R_1 R'$$
$$(+-++) = R_z^2 + 2R_z R_1 + 2R_z R'$$


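The procedure in each case is to add phenotypes until the result is equal to the
expected result of squaring the sum of a certain number of the genes. Therefore, we
have

$$r = \sqrt{\underline{r}^2 + x_1}$$
$$R' = \sqrt{(\underline{R}' + \underline{r})^2 + x_2} - \sqrt{\underline{r}^2 + x_1}$$
$$R'' = \sqrt{(\underline{R}'' + \underline{r})^2 + x_3} - \sqrt{\underline{r}^2 + x_1}$$
$$R_0 = \sqrt{(\underline{R}_0 + \underline{r})^2 + x_4} - \sqrt{\underline{r}^2 + x_1}$$
$$R_1 = \sqrt{(\underline{R}_1 + \underline{R}')^2 + x_5} - \sqrt{(\underline{R}' + \underline{r})^2 + x_2} + \sqrt{\underline{r}^2 + x_1}$$
$$R_z = \sqrt{(\underline{R}_z + \underline{R}_1 + \underline{R}')^2 + x_6} - \sqrt{(\underline{R}_1 + \underline{R}')^2 + x_5}$$

where we have here used bars under the letters to indicate the “true” values of the
variables. Then, proceeding as did Bernstein, we find

$$r - \underline{r} = f(x_1), \qquad f_1 = \frac{df}{dx_1} = \frac{1}{2\underline{r}}$$

at the point where $x_1 = 0$. Omitting the now unnecessary bars,

$$E(x_1)^2 = \frac{r^2(1 - r^2)}{G},$$

$$V(r) = E(r - \underline{r})^2 = E(f_1^2 x_1^2) = \frac{1}{4r^2}\cdot\frac{r^2(1 - r^2)}{G} = \frac{1 - r^2}{4G}.$$

Likewise

$$R' - \underline{R}' = f(x_1, x_2) = \sqrt{(R' + r)^2 + x_2} - \sqrt{r^2 + x_1} - R',$$

$$f_1 = \frac{-1}{2r}, \qquad f_2 = \frac{1}{2(R' + r)},$$

$$E(x_1)^2 = \frac{r^2(1 - r^2)}{G}, \qquad E(x_2)^2 = \frac{(R' + r)^2[1 - (R' + r)^2]}{G},$$

$$x_1 = (-+--) - (-+--)_0,$$
$$x_2 = (-+--) - (-+--)_0 + (++--) - (++--)_0 + (+---) - (+---)_0,$$

$$E(x_1 x_2) = V(-+--) + CV(-+--)(++--) + CV(-+--)(+---),$$

the subscripts “0” indicating mean values. Hence

$$E(x_1 x_2) = \frac{r^2(1 - r^2)}{G} - \frac{r^2(2R'r)}{G} - \frac{r^2 R'^2}{G} = \frac{r^2[1 - (R' + r)^2]}{G},$$

$$V(R') = E[f_1^2 (x_1)^2 + f_2^2 (x_2)^2 + 2 f_1 f_2 (x_1 x_2)]$$
$$= \frac{1 - r^2}{4G} + \frac{1 - (R' + r)^2}{4G} - \frac{2r^2[1 - (R' + r)^2]}{4r(R' + r)G}$$
$$= \frac{1}{4G}\left[2 - R'^2 - \frac{2r}{R' + r}\right]$$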
In exactly the same way we find

$$V(R'') = \frac{1}{4G}\left[2 - R''^2 - \frac{2r}{R'' + r}\right]$$

$$V(R_0) = \frac{1}{4G}\left[2 - R_0^2 - \frac{2r}{R_0 + r}\right]$$


The variances of $R_1$ and $R_2$ and $R_z$ are a little more complicated, but turn out to be

$$V(R_1) = \frac{1}{4G}\left[3 - R_1^2 - \frac{2r}{R' + r} - \frac{2R'^2}{(R' + r)(R_1 + R')}\right]$$

$$V(R_z) = \frac{1}{4G}\left[2 - R_z^2 - \frac{2(R' + R_1)}{C}\right]$$

where $C = R_z + R_1 + R'$

$$V(R_2) = \frac{1}{4G}\left[4 - C^2 - \frac{2r(R_0 + R'' + r)}{(R_0 + r)(R'' + r)} - (2C + R_0 + R'' + r)(R_0 + R'' + r)\right]$$


DISCUSSION


The methods outlined above can theoretically be used to derive the variance of 
the estimated frequency of any gene or chromosome, if the formula relating the 
desired frequency to the observed values is manageable. The formulas derived here 
are intended merely as examples. Formulas for the variances can also be derived by 
use of formula (C) in section 55 of Statistical Methods for Research Workers (Fisher 
1950, p. 309), although the labor involved does not appear to be any less, and the 
relationship of the results to the fundamental notions of variance and covariance 
does not seem quite so clear. Fisher’s formula has been used to check some of the 
new derivations presented here. 


ILLUSTRATIVE CALCULATIONS 


The usefulness of the formulas summarized and derived here is obvious in any case 
in which we wish to know the efficiency of a quick and easy method of estimation, or 
wish to set up confidence limits within which we expect the true frequencies to lie. 
We may illustrate some of the new formulas by an example from my compilation 
of blood group results up to the middle of 1938 (Boyd 1939), which is evidently still 
often referred to. When this compilation was being made, it was apparent that gene 
frequencies estimated from samples of less than 200 could not be very accurate, and 
I accordingly put all such results, along with others thought to be among the less 
reliable, in italics. However, not having formulas for the variances and standard 
deviations of the estimated frequencies, I was not then aware how great the errors 
might be. 

To have a specific and fairly typical example, let us consider the results of Mac- 
farlane on 160 members of the Mahishya caste in the area around Budge Budge, 
Bengal (Boyd 1939, p. 172). She found O = 32.5 per cent, A = 20.0, B = 39.4,
AB = 8.1. Using a nomogram designed to give the results of Wiener’s formulas, I
calculated the gene frequencies p = 0.154, q = 0.278, r = 0.571. The correct results 
of Wiener’s formulas for this population, to four decimal places, are 0.1545, 0.2777, 
0.5701. Use of the Bernstein formulas, which we now see to be preferable, would have 
given p = 0.1522, q = 0.2754, r = 0.5701, and the use of the efficient Bernstein 
adjustments, or the method of maximum likelihood, would have given 0.1524, 
0.2757, 0.5719. The relative efficiency of these different methods of estimation be- 
comes apparent if we compute the variances, using for Wiener’s estimates the 
formulas derived above, for Bernstein’s estimates the formulas derived by Bernstein, 
and for the method of maximum likelihood the formulas published by Neel and 
Schull (1954). We obtain V[p(W)] = 0.000621, V[p(B)] = 0.000440, V[p(ML)] =
0.000439. Following Fisher’s definition of the efficiency, we find that the efficiency 
of the Wiener estimate is 439/621 = 70.7 per cent, and that of the Bernstein estimate 
is 99.8 per cent in this case. In terms of information regarding p, my use of the Wiener 
estimates was equivalent to throwing away the results of testing 47 of the 160 persons 
tested. 
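The arithmetic of this example can be retraced with a few lines of code. The sketch below is a hedged illustration, not the procedure actually used for the compilation: the function names are invented here, the variance expressions are the square root formulas $[2 - p^2 - 2r/(p + r)]/4G$ (Wiener) and $[1 - (q + r)^2]/4G$ (Bernstein), evaluated at the efficient estimates quoted above, and the maximum likelihood variance is simply taken from the text rather than recomputed.

```python
from math import sqrt

def wiener_estimates(O, A, B):
    """Square root estimates p = sqrt(O+A) - sqrt(O), q = sqrt(O+B) - sqrt(O), r = sqrt(O)."""
    r = sqrt(O)
    return sqrt(O + A) - r, sqrt(O + B) - r, r

def bernstein_estimates(O, A, B):
    """Unadjusted estimates p = 1 - sqrt(O+B), q = 1 - sqrt(O+A), r = sqrt(O)."""
    return 1 - sqrt(O + B), 1 - sqrt(O + A), sqrt(O)

def var_p_wiener(p, r, G):
    return (2 - p ** 2 - 2 * r / (p + r)) / (4 * G)

def var_p_bernstein(q, r, G):
    return (1 - (q + r) ** 2) / (4 * G)

G = 160
O, A, B = 0.325, 0.200, 0.394            # observed phenotype frequencies
p, q, r = 0.1524, 0.2757, 0.5719         # efficient estimates quoted in the text

v_w = var_p_wiener(p, r, G)              # about 0.000621
v_b = var_p_bernstein(q, r, G)           # about 0.000440
v_ml = 0.000439                          # value quoted in the text, not recomputed here

efficiency_w = v_ml / v_w                # roughly 0.71
efficiency_b = v_ml / v_b                # very close to 1
persons_lost = G * (1 - efficiency_w)    # information discarded, in persons tested

print(wiener_estimates(O, A, B), bernstein_estimates(O, A, B))
print(v_w, v_b, efficiency_w, efficiency_b, persons_lost)
```

Small differences in the last decimal place from the figures in the text are to be expected, since the text works from rounded intermediate values.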

If these facts had been known to me in 1938, I should certainly have used the 
Bernstein estimates throughout my compilation, and preferably have applied the 
Bernstein adjustments. It has since been shown that the Wiener estimates can also be
adjusted, and then become efficient (Boyd 1954c), but this was not known at the
time. 

The standard errors of the various estimates of p are the square roots of the cor- 
responding variances, and are 


$$\sigma_{p(W)} = 0.0249, \qquad \sigma_{p(B)} = 0.0210, \qquad \sigma_{p(ML)} = 0.0210.$$


The 95 per cent confidence limits are p(W), 0.1035 to 0.2012; p(B), 0.1113 to 0.1935;
p(ML), 0.1113 to 0.1934. The wide limits show the uncertainty of the estimates,
and the somewhat narrower limits for the Bernstein estimate illustrate the
superiority of this method over the Wiener method.
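The limits quoted are consistent with the usual normal approximation, the estimate plus or minus 1.96 standard errors, centered on p = 0.1524. A brief sketch follows; the function name and the choice of 1.96 are assumptions made for illustration, since the text does not state the multiplier.

```python
from math import sqrt

def confidence_limits(estimate, variance, z=1.96):
    """Approximate 95 per cent limits: estimate -/+ z standard errors."""
    half_width = z * sqrt(variance)
    return estimate - half_width, estimate + half_width

p = 0.1524
for label, variance in (("W", 0.000621), ("B", 0.000440), ("ML", 0.000439)):
    lower, upper = confidence_limits(p, variance)
    print(label, round(lower, 4), round(upper, 4))
```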

Similar results are obtained if the efficiency of simple square root estimates of 
gene frequencies for other systems is computed, using formulas such as those derived 
above and comparing results with the method of maximum likelihood. It turns out 
that in many, perhaps most, cases the loss of information attendant upon the use 
of square root methods is great enough to justify the use of maximum likelihood 
calculations instead. 

I am indebted to a number of colleagues who read and criticized this paper while 
it was in the course of preparation. The result has been to make it much less imperfect,
but these gentlemen are of course not responsible for any errors which may
remain. I wish to extend my thanks to Dr. J. N. Spuhler, Dr. D. Lamphiear, Dr. W. 
J. Schull, and Prof. F. Mosteller. I am particularly indebted to the latter. 


SUMMARY 


Starting from first principles, methods of deriving the variances of blood group 
gene frequencies estimated by various methods are presented. The MN, MNS, 
ABO and Rh systems are considered. Variances are derived for the first time for 
Wiener’s estimates for the gene frequencies in the MNS system and for the much- 
used estimates of Rh gene frequencies by the method of Race, Mourant and
McFarlane. Illustrative calculations are given.


Efficiency of Gene Frequency Estimates for
the ABO System


MORRIS H. DEGROOT 


Committee on Statistics, University of Chicago 


SINCE THE MAXIMUM LIKELIHOOD gene frequency estimates for the ABO blood 
group system are efficient, explicit algebraic expressions for their covariance matrix 
may be useful in simplifying many statistical problems. One such problem is the 
evaluation of the efficiencies of alternative estimation procedures. We will derive 
concise expressions for the maximum likelihood variances and covariances and 
carry out a quantitative comparison of the efficiencies of two simpler estimation 
schemes, namely those of Bernstein (1925) and Wiener et al. (1929).

The following notation is employed throughout the paper. We denote the total 
number of observations in the sample by N and the observed proportions of indi- 
viduals of each type by O, A, B, and AB. Estimates of the gene frequencies, p, q, 
and r, are denoted by p’, q’, and r’. The letter V signifies the variance of an estimate, 
and the subscript B, W, or ML on V denotes the particular estimation procedure 
under consideration; e.g., $V_B(p')$ is the variance of the Bernstein estimate of p. We
denote by CV(p’, q’), with appropriate subscript, the covariance of p’ and q’. 

All of the variances and covariances considered are understood to be large sample 
results. Boyd (1956) discusses the accuracy of these results for the sample sizes 
usually encountered in practice. It is also assumed throughout that neither p, q, 
nor r is zero; thus, these quantities are used freely as divisors. The special cases in 
which one of them is zero may be easily treated separately. 


THE COVARIANCE MATRIX OF MAXIMUM LIKELIHOOD ESTIMATES 


Stevens (1950) has derived the following formulas for the elements of the infor- 
mation matrix of the maximum likelihood estimates: 


$$I_{pp} = \frac{2N(4 - 4p - 6q + 6pq + 2q^2 - 3pq^2)}{p(2 - p - 2q)(2 - 2p - q)},$$

$$I_{qq} = \frac{2N(4 - 6p - 4q + 6pq + 2p^2 - 3p^2 q)}{q(2 - p - 2q)(2 - 2p - q)},$$

$$I_{pq} = \frac{2N(4 - 4p - 4q + 3pq)}{(2 - p - 2q)(2 - 2p - q)}.$$

As discussed by Neel and Schull (1954), the covariance matrix, which is the
inverse of the information matrix, can be obtained from these formulas by using
the relations:

$$V_{ML}(p') = I_{qq}/\Delta,$$
$$V_{ML}(q') = I_{pp}/\Delta,$$
$$CV_{ML}(p', q') = -I_{pq}/\Delta,$$

where

$$\Delta = I_{pp} I_{qq} - I_{pq}^2.$$


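The results of the above operations can be greatly simplified, and we obtain the
following concise expressions:

$$V_{ML}(p') = \frac{p}{8N}\left[4 - 2p - \frac{p^2 q}{pq + r}\right],$$

$$V_{ML}(q') = \frac{q}{8N}\left[4 - 2q - \frac{p q^2}{pq + r}\right],$$

$$CV_{ML}(p', q') = -\frac{pq}{8N}\left[4 - \frac{pq}{pq + r}\right].$$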


Since, in the maximum likelihood method, r’ = 1 — p’ — q’, the rest of the co- 
variance matrix involving the estimate r’ is readily obtained from the formulas: 


$$V_{ML}(r') = V_{ML}(p') + V_{ML}(q') + 2CV_{ML}(p', q'),$$
$$CV_{ML}(p', r') = -V_{ML}(p') - CV_{ML}(p', q'),$$
$$CV_{ML}(q', r') = -V_{ML}(q') - CV_{ML}(p', q').$$


It should be noted that the above expressions for the variances hold, not only 
for the maximum likelihood estimates, but for any other fully efficient set of estimates 
as well. 
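The inversion described in this section can also be carried out numerically, which gives a check on any algebraic simplification. The sketch below is an illustration under stated assumptions: the function names are invented, the information matrix is built directly from the expected phenotype frequencies $O = r^2$, $A = p^2 + 2pr$, $B = q^2 + 2qr$, $AB = 2pq$ with $r = 1 - p - q$, rather than from the formulas of Stevens, and the trial values are those of the Bengal sample discussed by Boyd (1956).

```python
def abo_information(p, q, N):
    """Elements I_pp, I_qq, I_pq of the information matrix for N individuals."""
    r = 1.0 - p - q
    # Expected phenotype frequencies under random mating
    freq = {"O": r * r, "A": p * p + 2 * p * r, "B": q * q + 2 * q * r, "AB": 2 * p * q}
    # Partial derivatives of each frequency, with r = 1 - p - q substituted
    d_p = {"O": -2 * r, "A": 2 * r, "B": -2 * q, "AB": 2 * q}
    d_q = {"O": -2 * r, "A": -2 * p, "B": 2 * r, "AB": 2 * p}
    i_pp = N * sum(d_p[k] ** 2 / freq[k] for k in freq)
    i_qq = N * sum(d_q[k] ** 2 / freq[k] for k in freq)
    i_pq = N * sum(d_p[k] * d_q[k] / freq[k] for k in freq)
    return i_pp, i_qq, i_pq

def ml_covariance(p, q, N):
    """Large sample variances and covariances of efficient estimates of p, q and r."""
    i_pp, i_qq, i_pq = abo_information(p, q, N)
    delta = i_pp * i_qq - i_pq ** 2
    v_p, v_q, cv_pq = i_qq / delta, i_pp / delta, -i_pq / delta
    return {
        "V_p": v_p, "V_q": v_q, "CV_pq": cv_pq,
        "V_r": v_p + v_q + 2 * cv_pq,      # since r' = 1 - p' - q'
        "CV_pr": -v_p - cv_pq,
        "CV_qr": -v_q - cv_pq,
    }

cov = ml_covariance(0.1524, 0.2757, 160)
print(cov["V_p"])    # close to the 0.000439 quoted by Boyd (1956)
```

Building the matrix from the phenotype frequencies keeps the sketch independent of any particular closed form, at the cost of a few more arithmetic operations.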


COMPARISON OF THE ESTIMATES 


The Bernstein (1925) estimates of p, q, and r are given by 
$$p' = 1 - \sqrt{O + B},$$
$$q' = 1 - \sqrt{O + A},$$
$$r' = \sqrt{O}.$$


Although these estimates are not efficient, Bernstein has given an adjustment 
which renders them equivalent to the maximum likelihood estimates in the sense 
that both are fully efficient (Bernstein 1930, Stevens 1938). In fact, since explicit 
expressions for the maximum likelihood estimates have not yet been obtained, these 
adjusted estimates and the set mentioned below seem to be the only readily available 
efficient ones. However, since we are concerned with the question of how much is 
lost in using the simpler estimates, it is unnecessary to consider the details of the 
adjustment here.

The variances of the unadjusted Bernstein estimates given above are

$$V_B(p') = [1 - (q + r)^2]/4N,$$
$$V_B(q') = [1 - (p + r)^2]/4N,$$
$$V_B(r') = (1 - r^2)/4N.$$


Another simple estimation procedure has been proposed by Wiener, whose esti- 
mates of the gene frequencies are 


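$$p' = \sqrt{O + A} - \sqrt{O},$$
$$q' = \sqrt{O + B} - \sqrt{O},$$
$$r' = \sqrt{O}.$$

Boyd (1956) gives for their variances the following:

$$V_W(p') = \left[2 - p^2 - \frac{2r}{p + r}\right]\Big/ 4N,$$
$$V_W(q') = \left[2 - q^2 - \frac{2r}{q + r}\right]\Big/ 4N,$$
$$V_W(r') = (1 - r^2)/4N.$$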


An adjustment for these estimates, similar to the Bernstein adjustment, which 
renders them fully efficient also, has been derived by Boyd (1954). 

In the comparisons which follow, all statements concerning the estimates of the 
gene frequency p can be transformed into statements concerning the corresponding 
q-estimates simply by interchanging p and q wherever they appear in the ensuing 
discussion. This transformation is possible because of the symmetry in p and q of 
both the original model and the estimation procedures considered. 

Since the maximum likelihood p-estimate is efficient, information concerning the
efficiency of the Bernstein p-estimate is found by examining the ratio $V_B(p')/V_{ML}(p')$.
Since this ratio is the reciprocal of the efficiency, large values reflect an inefficient
estimate, whereas values close to 1 reflect an estimate of high efficiency. Forming
the ratio and simplifying, we obtain

$$V_B(p')/V_{ML}(p') = 1 + \frac{p^2 q}{pq(4 - 3p) + 2r(2 - p)}.$$

That the second term on the right side of this equation is always positive reflects
the fact that the Bernstein estimate is never fully efficient. We can, however, derive
a simple upper bound for the ratio by dropping the last term in the denominator of
the above expression. This gives

$$V_B(p')/V_{ML}(p') < 1 + \frac{p^2 q}{pq(4 - 3p)}$$

or

$$V_B(p')/V_{ML}(p') < 1 + [p/(4 - 3p)].$$

It is now evident that as p takes values close to 1 the ratio approaches its maximum
value of 2, whereas for p-values close to 0 it approaches the value 1. Thus, the 
Bernstein estimate is never less than fifty per cent efficient and, in fact, for small 
values of p it is almost fully efficient. 

Let us now turn to a comparison of the Wiener and Bernstein methods. Boyd 
(1956), in his discussion of the two procedures, has stated that “... the variances 
of the Wiener estimates are generally larger than those of the Bernstein estimates. . .” 
and that “...for certain blood group distributions, which are not, however, likely 
to be met with in actual populations, the variance of the Wiener estimate may be at 
least five times that of the Bernstein estimate.” These points are rigorously demonstrated
by examining the ratio of $V_W(p')$ to $V_B(p')$. After simplification, we have

$$V_W(p')/V_B(p') = 1 + \frac{2q}{(1 - q)(2 - p)}.$$

Since the second term on the right side of this expression is positive for all values of
p and q, it is clear that $V_W(p')$ is always greater than $V_B(p')$. Furthermore, as q
approaches the value 1, the ratio grows large without bound. On the other hand,
the ratio is small (close to 1) when q is small. Therefore, since the Bernstein estimate
is highly efficient only when p is small, and since $V_W(p')$ is close to $V_B(p')$ only
when q is small, the Wiener estimate has high efficiency only when both p and q
are small.
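The simplification can be followed step by step. The lines below are a sketch that assumes the two large sample variances take the forms $V_B(p') = [1 - (q + r)^2]/4N$ and $V_W(p') = [2 - p^2 - 2r/(p + r)]/4N$, and uses only $p + q + r = 1$:

```latex
\begin{aligned}
V_B(p') &= \frac{1 - (1 - p)^{2}}{4N} = \frac{p(2 - p)}{4N},\\[4pt]
V_W(p') &= \frac{1}{4N}\left[\frac{2p}{p + r} - p^{2}\right]
         = \frac{p}{4N}\left[\frac{2}{p + r} - p\right],\\[4pt]
\frac{V_W(p')}{V_B(p')} &= \frac{\dfrac{2}{p + r} - p}{2 - p}
   = 1 + \frac{\dfrac{2}{p + r} - 2}{2 - p}
   = 1 + \frac{2(1 - p - r)}{(p + r)(2 - p)}
   = 1 + \frac{2q}{(1 - q)(2 - p)}.
\end{aligned}
```

For the Bengal sample of Boyd (1956), with p = 0.1524 and q = 0.2757, the ratio is about 1.41, in agreement with the quoted variances 0.000621 and 0.000440.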

As mentioned previously, an analogous discussion for the q-estimates could be 
given. The r-estimates are identical in the Bernstein and Wiener methods and, hence, 
so are their variances and efficiencies. The expression for their efficiency is 


$$V_{ML}(r')/V_B(r') = 1 - \frac{pq(1 + r)}{2(1 - r)(pq + r)}.$$

Straightforward, but somewhat lengthy, manipulation reveals that this quantity is
never smaller than 1/2 and approaches 1 as either p or q approaches 0.

Numerical examples of these comparisons are given by Sukhatme (1942) and 
Boyd (1956). 

I am deeply indebted to Professors William C. Boyd and Frederick Mosteller for 
introducing me to these problems and for their valuable assistance in the preparation 
of this paper. 


SUMMARY 


Explicit expressions are obtained for the elements of the covariance matrix of 
efficient estimates. Using these expressions we have shown that for certain blood 
group distributions the unadjusted Bernstein estimates are highly efficient and that, 
in general, they are never less than 50 per cent efficient. The unadjusted Wiener 
estimates are shown to be never more efficient than the Bernstein estimates.


Penetrance Calculus in Population Genetics


ARNE TRANKELL 
Department of Psychology, University of Goteborg 


INTRODUCTION 


IN AN EARLIER ARTICLE in this Journal (1955b) the application of population genetics 
to behavior traits has been discussed by the author. The basic idea was the construc- 
tion of mathematical models in which the influence of the environment and the 
measuring technique are taken into account. 

As left-handedness offers certain advantages in this context the inheritance of this 
trait was used as an illustration. The presentation was made on the assumption of 
monofactorial diallellia, and a theoretical model for incomplete penetrance of the 
recessive gene was shown to describe satisfactorily the inheritance of handedness in 
three different American investigations. 

Strictly speaking it is impossible to prove the validity of a hypothesis. The earlier 
article accordingly gives no final proof that handedness is inherited in the way stated. 
As long as no other model is shown to describe the empirical data as well or better, 
we have, however, good reason to believe that the model describes what is really hap- 
pening. It may be of a certain interest, therefore, to test the feasibility of other models 
that may be constructed. In this article the possibilities existing within the hypothesis 
of monofactorial diallellia will be expounded. 


THE NUMBER OF MODELS WITHIN MONOFACTORIAL DIALLELLIA 


Instead of the concept of dominance-recessivity, which is a special case of mono- 
factorial diallellia, we shall take as our point of departure the general assumption 
that we have to take into account two alleles, designated D and R, and two alternative
phenotypes, designated A and Ā (not A). No specific relation between the genotypes
and the phenotypes is assumed, i.e. each of the three genotypes DD, DR and
RR may produce both A and Ā. If the concept of penetrance is used for the portion
of the carriers of a certain genotype manifesting the phenotype A (the penetrance may 
vary from genotype to genotype) the theoretical limits of the penetrance may be 
given as 0 indicating that no carrier of the genotype in question manifests the 
trait A, and 1 indicating that all members of this genotype manifest A. Penetrance 
values between these two limits imply that both phenotypes are possible among the 
representatives of the genotype in question. 

If the latter case is symbolized by + (this sign indicating that the penetrance is
not tied to any of the two limiting values), monofactorial diallellia may be designated
by (+ + +), where the three plus signs indicate the penetrance of the genotypes
DD, DR and RR. The different special cases arise when the penetrance of one or
more genotypes approaches 0 or 1. The case (1 1 0) is thus a symbolic designation
for dominant-recessive inheritance in the classic sense, where A is the dominant and
Ā the recessive phenotype. The designation (0 0 1) refers to the same type of
inheritance where A and Ā have changed places.

Altogether 3³ = 27 different models can be constructed. A close analysis of these
reveals, however, that most of them are of little importance since they only express
the same hereditary process in different ways. The number of independent alternates
is ten, which means that nine sub-hypotheses referring to various special cases
of monofactorial diallellia can be formulated on the basis of the general scheme
(+ + +). These special cases have been arranged in Table 1 according to the
degree of simplicity.


TABLE 1. VARIANTS OF MONOFACTORIAL DIALLELLIA

Model   No.    DD   DR   RR    Remark
I        1)    1    1    1     Species typical traits
         2)    0    0    0
II       3)    1    0    0     Dominant-recessive inheritance
         4)    0    0    1
         5)    0    1    1
         6)    1    1    0
III      7)    1    0    1     Intermediate inheritance*
         8)    0    1    0
IV       9)    +    0    0     Model for the inheritance of handedness (valid possibly
        10)    0    0    +     also for schizophrenia)
        11)    +    1    1
        12)    1    1    +
V       13)    1    +    0     Model possibly valid for dyslexia
        14)    0    +    1
VI      15)    0    +    0     Variant of intermediate inheritance
        16)    1    +    1
VII     17)    1    0    +     Variant of intermediate inheritance
        18)    +    0    1
        19)    0    1    +
        20)    +    1    0
VIII    21)    +    +    0     Model possibly valid for dyslexia
        22)    0    +    +
        23)    +    +    1
        24)    1    +    +
IX      25)    +    0    +     Variant of intermediate inheritance
        26)    +    1    +
G       27)    +    +    +     General model

* Take, for instance, the course of inheritance among the Andalusian hens, which are pearl-grey
heterozygotes of a pair of alleles, which in their homozygous combinations give rise to white or
black hens, respectively. If we designate the grey color as A, and not grey as Ā (or the other way
around), model III is usable. Intermediate inheritance of this type may of course also be looked
upon as an example of monohybrid diallellia with three phenotypes. The models VI, VII and IX
may be considered more or less improbable variants of model III.
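The count of ten independent alternatives can be reproduced by enumeration. The sketch below rests on an assumption that is not spelled out in the text: two models are taken to "express the same hereditary process" when one is obtained from the other by interchanging the names of the alleles D and R (reversing the triple), by interchanging the phenotypes A and Ā (exchanging 0 and 1), or by both. The function names are illustrative.

```python
from itertools import product

def swap_alleles(model):
    """Interchange D and R: (DD, DR, RR) becomes (RR, DR, DD)."""
    return model[::-1]

def swap_phenotypes(model):
    """Interchange A and not-A: penetrance 0 becomes 1 and 1 becomes 0; '+' is unchanged."""
    exchange = {"0": "1", "1": "0", "+": "+"}
    return tuple(exchange[x] for x in model)

def equivalence_classes():
    """Group the 27 penetrance patterns into classes of equivalent models."""
    seen, classes = set(), []
    for model in product("10+", repeat=3):
        if model in seen:
            continue
        orbit = {model, swap_alleles(model), swap_phenotypes(model),
                 swap_alleles(swap_phenotypes(model))}
        seen |= orbit
        classes.append(sorted(orbit))
    return classes

classes = equivalence_classes()
for members in classes:
    print(len(members), [" ".join(m) for m in members])
print(len(classes), "independent models")
```

Under this assumption the classes have sizes 1, 2 or 4, and the two operations together account for the grouping of the dominant-recessive variants and of the handedness variants into sets of four.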

In the earlier article the sub-hypothesis number IV was dealt with thoroughly and 
mathematical expressions were developed for the part of the descendants in different 
types of mating that can be expected to show the phenotype A. 

The formulae considered valid for the inheritance of specific dyslexia which were 
presented in the same article correspond to those of the sub-hypothesis number V. 

For each of the 27 cases in Table 1 it is possible to develop expressions that indicate 
how large a part of the filial generation within each type of mating can be expected 
to be of phenotype A. The most complicated formulae are those for the general model. 
Since it is possible to derive all formulae from the three equations of the general model, 
the construction of these will be described in detail. 


THE CONSTRUCTION OF THE EQUATIONS OF THE GENERAL MODEL 


1) The reproduction is assumed to be characterized by panmixia. 

2) The relative frequencies of the alleles D and R are designated d and r. 

3) The relative frequencies of the phenotypes A and Ā in the parental generation
are designated a and c.

4) The frequency of D-gametes in the parental generation produced by A-individuals
is designated as m and the same frequency of R-gametes as n. Consequently
the frequency of D-gametes in the parental generation produced by Ā-individuals is
d - m and the same frequency of R-gametes r - n.

5) From these frequencies of gametes in the parental generation the frequencies
of different genotypes in the filial generation may be obtained for the family types
(A x A), (A x Ā) and (Ā x Ā). These frequencies are to be found in Table 2.

6) The relative frequencies of the phenotypes A and Ā in the filial generation are
designated b and (1 - b).

7) The frequency of A-individuals belonging to the genotype DD in the filial generation
is designated as h. For the genotype DR the same frequency is designated as
i, and for the genotype RR as k. This means that h + i + k = b.

8) The proportion of the phenotype A among DD-individuals in the filial generation
may then be written as $h/d^2$. The same proportion among the DR-individuals
may be written as $i/2dr$, and among the RR-individuals as $k/r^2$.

9) By use of Table 2 and the expressions in the preceding paragraph we may now 


TABLE 2. THE FREQUENCIES OF THE GENOTYPES IN THE FILIAL GENERATION

                        Genotypes of offspring
Family type     DD             DR                        RR            Total
A x A           m²             2mn                       n²            a²
A x Ā           2m(d - m)      2n(d - m) + 2m(r - n)     2n(r - n)     2ac
Ā x Ā           (d - m)²       2(d - m)(r - n)           (r - n)²      c²
Total           d²             2dr                       r²            1


TABLE 3. THE PROPORTIONS OF PHENOTYPE A IN DIFFERENT SUB-HYPOTHESES WITHIN
MONOFACTORIAL DIALLELLIA.


Model AxA AxA AxA 
1 — 
| 
d d \ 
1/100 1 (;4) 
| r r ¥ 
ifr (5) 
d \ d 
1110) 1 ( 0 
i+r i+r 
1 1 
b b b (d-a\t 
4a? 2dr 2a? 4ac 2dr 2ac 4c? 2dr 2c? 
ee | b 1 b 1 b 4dr—1 
| b—d? (a—d*)? | b—d? (a—d*)(c—dr) |  (c—dr)? 


| i 2mn AX A) 
h m(d—m) + ey n(d—m)+m(r—n) 


ac 2dr ac 
A X A) 
d? 2dr 
| h m n h m(d—m) , n(r—n)|h (d—m)? , k (r-n)? 
d? a? ra? d? ac ac | d? c? r 


obtain an expression for the proportion of phenotype A among the children in marital
relations of type (A x A). If this proportion is designated $P_1$, the following equation
may be constructed:

$$P_1 = \frac{h}{d^2}\cdot\frac{m^2}{a^2} + \frac{i}{2dr}\cdot\frac{2mn}{a^2} + \frac{k}{r^2}\cdot\frac{n^2}{a^2} \qquad (1)$$

10) If the same proportion for marital relations of type (A x Ā) is designated $P_2$,
the following equation may be constructed:

$$P_2 = \frac{h}{d^2}\cdot\frac{m(d - m)}{ac} + \frac{i}{2dr}\cdot\frac{n(d - m) + m(r - n)}{ac} + \frac{k}{r^2}\cdot\frac{n(r - n)}{ac} \qquad (2)$$

11) If the same proportion for marital relations of type (Ā x Ā) is designated $P_3$,
the following equation may be constructed:

$$P_3 = \frac{h}{d^2}\cdot\frac{(d - m)^2}{c^2} + \frac{i}{2dr}\cdot\frac{2(d - m)(r - n)}{c^2} + \frac{k}{r^2}\cdot\frac{(r - n)^2}{c^2} \qquad (3)$$

THE EQUATIONS IN THE SUB-HYPOTHESES 


The three equations in paragraphs 9, 10 and 11 are the general equations for mono- 
factorial diallellia with two alternative phenotypes. The corresponding equations for 
the nine sub-hypotheses are obtained if the quantities m, n, h, i and k are given the 
values which correspond to the assumptions regarding the penetrance in these
sub-hypotheses. In model II, variant (1 0 0), for instance, the quantities m and h are
equal to d², while n, i, and k equal zero. The equations will then have the following
forms: P₁ = 1, P₂ = d/(1 + d), P₃ = d²/(1 + d)², expressions which are well known
(cf. Snyder, 1932; see also Stern, 1950, p. 167, and Li, 1955, p. 15).
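
The substitution just described can be checked numerically. The Python sketch below evaluates the three general equations and confirms the forms quoted for model II, variant (1 0 0). It rests on a reading inferred from the stated values, not spelled out in this passage: h, i and k are taken as the population frequencies of A-individuals of genotypes DD, DR and RR (so that h/d², i/2dr and k/r² act as penetrances), and m and n as the frequencies of D and R gametes carried by A-individuals, with a = m + n and c = 1 − a. The function name and the value d = 3/10 are illustrative only.

```python
from fractions import Fraction


def phenotype_a_proportions(d, m, n, h, i, k):
    """Return (P1, P2, P3): the proportion of phenotype A among children of
    A x A, A x not-A and not-A x not-A matings (assumed reading, see text)."""
    r = 1 - d          # frequency of allele R
    a = m + n          # frequency of phenotype A
    c = 1 - a          # frequency of the alternative phenotype
    # Penetrance of A within each genotype class
    pen_dd = h / d**2 if d else 0
    pen_dr = i / (2 * d * r) if d * r else 0
    pen_rr = k / r**2 if r else 0

    def weighted(dd, dr, rr, total):
        # Offspring genotype frequencies weighted by penetrance,
        # scaled by the frequency of the family type
        return (pen_dd * dd + pen_dr * dr + pen_rr * rr) / total

    p1 = weighted(m**2, 2 * m * n, n**2, a**2)
    p2 = weighted(2 * m * (d - m), 2 * n * (d - m) + 2 * m * (r - n),
                  2 * n * (r - n), 2 * a * c)
    p3 = weighted((d - m)**2, 2 * (d - m) * (r - n), (r - n)**2, c**2)
    return p1, p2, p3


# Model II, variant (1 0 0): m = h = d**2 and n = i = k = 0
d = Fraction(3, 10)
p1, p2, p3 = phenotype_a_proportions(d, m=d**2, n=0, h=d**2, i=0, k=0)
```

Exact fractions are used so that the results can be compared with d/(1 + d) and d²/(1 + d)² without rounding error.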

In Table 3, the expressions P₁, P₂ and P₃ are given for each of the nine
sub-hypotheses. Only one formulation has been given for each sub-hypothesis, except in the case
of model II, where the expressions for all four formulations have been given for the 
purpose of demonstration. As has already been pointed out the formulae in model II 
(and I) have been known for a long time. The remaining model with complete or zero 
penetrance in each genotype, i.e. number III, has also been discussed earlier (cf. 
Cotterman, 1953, p. 202). In the table it may be seen that the quantities m, n, h, i 
and k may be eliminated in all models where incomplete penetrance does not occur 
within more than one genotype. The proportion of A-individuals may then be 
expressed by the known quantities a (c) and b, and the unknown quantity d (r). The 
expressions in models IV and V are identical with those given in the article previously 
published. 

If the above is applied to the problem of the inheritance of handedness, it is
readily seen that, of models I-VII, only number IV can be used to describe the
empirical material on handedness presented in the earlier article. Dr. Carl-Gustaf
Berglin, Göteborg, has performed a systematic test of the hypotheses on this material,
in a paper not yet published. He succeeded in testing model VIII by showing that this
model (+ + 0) is improbable unless the penetrance in the group of heterozygotes
approaches zero, i.e. when the model becomes identical with model IV. Model IX is
not accessible to a test.
The Heritability of Certain Anthropometric
Characters as Ascertained from 
Measurements of Twins 


PHILIP J. CLARK 
Institute of Human Biology, University of Michigan, Ann Arbor, Michigan 


INTRODUCTION 


VARIATION in some of the most interesting human attributes is of a nearly continuous 
nature and has not yet been associated with specific genes. For this reason there is 
sometimes disagreement concerning the relative importance of environmental and 
genetic factors in producing this variation. Estimates of the genetic component of 
the variation which are based on sib-sib and parent-offspring correlation or regression 
tend to be exaggerated by the fact that environmental factors are not constant be- 
tween families. The unique advantage of twins in overcoming this difficulty has been 
recognized since the time of Galton. Until recently, however, the number of recog- 
nizable and commonly variable genetic characters has been too small to be of much 
use in diagnosing zygosity. Partly for this reason and partly because of the expense of 
obtaining extensive data from series of twins not many comprehensive studies of 
heritability have been attempted with them. 

In 1952 the Institute of Human Biology, under the direction of Lee R. Dice, initi- 
ated the Hereditary Abilities Study with the purpose of investigating the heritability 
and interrelationships of a number of psychological, biochemical, and physical 
traits. The data collected by this study consist of numerous measurements from 
monozygous and like-sexed dizygous twins. The present paper, which is concerned 
with the heritability of the physical traits only, is one of a series of reports which will 
be based on these data. 


CHARACTERS STUDIED 


The anthropometric traits investigated were selected with the advice of J. N. 
Spuhler and are listed in Table I. The definitions of most of these characters and the 
methods employed in making the measurements are those of Martin (1928). The 
numbers by which Martin designated these traits are given in Table I. Martin does 
not describe bicondylar breadth of the arm, but this measurement is wholly analogous 
to bicondylar breadth of the leg. The dermatoglyphic traits—finger print pattern 
intensity and palmar main-line index—are described by Cummins and Midlo (1943). 
The data on birth weights were obtained from parent interviews. For bilateral 
characters the mean of the measurements from both sides was used as the value of the 
trait for each individual. 

Received July 30, 1955. 
TABLE I. MARTIN NUMBERS AND ESTIMATES OF THE VARIANCE WITHIN LIKE-SEXED DIZYGOUS TWINS,
σ_D², THE VARIANCE WITHIN MONOZYGOUS TWINS, σ_M², AND OF HERITABILITY, h², FOR VARIOUS
ANTHROPOMETRIC TRAITS


Traits 
Birth weight, oz 51.1 
Current weight, lbs 41.4 
Stature, mm 1 195.4 
Span, mm 17 317.7 
Sitting height, mm 23 130.7 
Bi-iliac breadth, mm 40 79.0 
Total arm length, mm 45 54.0 
Forearm length, mm 48 12.9 
Hand length, mm 49 3.9 
Middle finger length, mm 51 1.4 
Hand breadth, mm 52 a 
Bi-acromial breadth, mm 55 106.7 
Foot length, mm 58 10.9 
Chest circumference, mm 61 423.7 
Waist circumference, mm 62 978.8 
Neck circumference, mm 63 57.1 
Hip circumference, mm 64 6.0 
Midarm circumference, mm 65 1 
Forearm circumference, mm 66 
Wrist circumference, mm 67 
Bicondylar breadth—leg, mm 68(4) 
Bicondylar breadth—arm, mm 
Maximum calf circumference, mm 69 
Minimum ankle circumference, mm 70 
Head length, mm 1 


Head breadth, mm 3 
Minimum frontal breadth, mm 4 
Bi-zygomatic breadth, mm 6 
Bi-gonial breadth, mm 8 
Interpalpebral breadth, mm 9 


Bi-palpebral breadth, mm 10 
Interpupillary distance, mm 12 
Nose breadth, mm 13 
Head height, mm 15 
Total facial height, mm 18 
Upper facial height, mm 20 
Nose height, mm 21 
Ear height, mm 29 
Ear breadth, mm 30 
Head circumference, mm 45 


Cephalic module, mm 

Cephalic index × 100
Cephalo-facial index × 100
Total facial index × 100
Relative shoulder breadth × 100
Relative sitting height × 100
Reciprocal ponderal index 
Fingerprint pattern intensity 
Palmar main-line index 


* Significant at the 5% level. 
** Significant at the 1% level. 


h² × 100
35 
69** 
72** 
59** 
9o0** 
82** 
** 
31 
g1** 
61** r 
25 
** 
62** t 
53** Cc 
65** fi 
i 
54** \ 
54** 
14.9 4 
6.1 2 61** 
8.8 3 60** 

13.2 3 71** I 
4.5 1 60** t 
6.1 3 41* 

5.6 1 
3.2 1 66** 

27.7 8 69** 

20.0 5 74** \ 
11.4 3 72** 
9.3 
4.8 
2.9 | 

100.0 é 

38 
2.6 54** 
4.0 72** 

0.4 33 
0.3 
26.5 aa 
1.1 
3.0 
The ratios studied are defined as follows:
Cephalic index: head breadth/head length
Cephalo-facial index: bi-zygomatic breadth/head breadth
Cephalic module: (head height + head length + head breadth)/3
Total facial index: total facial height/bi-zygomatic breadth
Reciprocal ponderal index: stature/∛weight
Relative shoulder breadth: bi-acromial breadth/stature
Relative sitting height: sitting height/stature.
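
The definitions above can be written out as a short Python sketch. The function name, the argument names and the measurements in the example are hypothetical and are not taken from the study. The ratios are returned unscaled, whereas several of them are tabulated as × 100, and the reciprocal ponderal index is computed in whatever units are supplied for stature and weight.

```python
def anthropometric_ratios(head_length, head_breadth, head_height,
                          bizygomatic_breadth, total_facial_height,
                          biacromial_breadth, sitting_height,
                          stature, weight):
    """Compute the derived ratios from the raw measurements of one person."""
    return {
        "cephalic_index": head_breadth / head_length,
        "cephalo_facial_index": bizygomatic_breadth / head_breadth,
        "cephalic_module": (head_height + head_length + head_breadth) / 3,
        "total_facial_index": total_facial_height / bizygomatic_breadth,
        # stature divided by the cube root of weight
        "reciprocal_ponderal_index": stature / weight ** (1 / 3),
        "relative_shoulder_breadth": biacromial_breadth / stature,
        "relative_sitting_height": sitting_height / stature,
    }


# Hypothetical measurements (lengths in mm, weight in lbs)
example = anthropometric_ratios(
    head_length=190.0, head_breadth=152.0, head_height=125.0,
    bizygomatic_breadth=133.0, total_facial_height=119.7,
    biacromial_breadth=374.0, sitting_height=884.0,
    stature=1700.0, weight=125.0,
)
```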


TWINS 


The various measurements were taken on a series of 21 female and 23 male mono- 
zygous pairs and 23 female and 14 male, like-sexed dizygous pairs. Except for 3 pairs 
who were undergraduates in the University of Michigan, the twins were from high 
schools and junior high schools in Ann Arbor, Ypsilanti, Dearborn, and Detroit. They 
ranged in age from 12 to 20 years, the median age being 16. 

Twins were classified as dizygous if (1) they were discordant in either the ABO, 
MN, Rh, Kell, Duffy, or secretor reactions, the ABO and Rh types being based on 
the antisera A, absorbed A, and B and C, D, E, c, and e, respectively; if (2) the 
color or pattern of the iris of their eyes were conspicuously different; or if (3) their 
facial and head characters—including hair color—were markedly dissimilar. Other- 
wise they were classified as monozygous. A full discussion of the procedures employed 
in this study for diagnosing the zygosity of twins is given in another report (Sutton, 
Vandenberg, and Clark, in manuscript). 
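
The classification rule stated above amounts to a simple decision procedure, sketched here in Python. The record layout (one dictionary of serological reactions per twin) and the two Boolean judgments on iris and on facial and head characters are assumptions made for illustration. The full diagnostic procedure is the one described in the report cited above and is not reproduced here.

```python
# Serological systems named in the text
BLOOD_SYSTEMS = ("ABO", "MN", "Rh", "Kell", "Duffy", "secretor")


def classify_zygosity(twin_a, twin_b, iris_differs, face_head_differs):
    """Classify a like-sexed pair as 'dizygous' or 'monozygous'.

    twin_a, twin_b: dicts mapping each system in BLOOD_SYSTEMS to a reaction.
    iris_differs: True if iris color or pattern is conspicuously different.
    face_head_differs: True if facial and head characters (including hair
    color) are markedly dissimilar.
    """
    discordant = any(twin_a[s] != twin_b[s] for s in BLOOD_SYSTEMS)
    if discordant or iris_differs or face_head_differs:
        return "dizygous"
    return "monozygous"


# Hypothetical pair, concordant on every criterion
first = {"ABO": "A", "MN": "MN", "Rh": "CDe/cde", "Kell": "neg",
         "Duffy": "pos", "secretor": "pos"}
second = dict(first)
same = classify_zygosity(first, second, False, False)

# The same pair with a single discordant reaction
second_discordant = dict(first, MN="M")
different = classify_zygosity(first, second_discordant, False, False)
```

Because monozygosity is assigned by default, any discordance that goes undetected classifies a dizygous pair as monozygous. The rule is therefore only as reliable as the number of characters examined.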


HERITABILITY 


Heritability, designated by h², may for our present purposes be defined as the
proportion of the variance within like-sexed dizygous twin pairs which is attributable
to genetic factors. It may be estimated by


$$h^2 = \frac{\sigma_D^2 - \sigma_M^2}{\sigma_D^2} \qquad (1)$$


where σ_D² and σ_M² are the within-pair variances for dizygous and monozygous twins
respectively. The significance of the difference between h² and zero is tested by the
variance ratio F = σ_D²/σ_M², there being 37 and 44 degrees of freedom for the dizygous
and monozygous variances respectively. Estimates of σ_D², σ_M², and h² are given for
each trait in Table I. In obtaining these estimates it was possible to combine the
sexes and the various ages, since the within-pair variances could not be shown to
differ in relation to either sex or age.
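
As an illustration of formula (1) and the accompanying variance ratio, the following Python sketch computes both from paired measurements. It assumes that the within-pair variance is the usual within-pair mean square, the sum of squared co-twin differences divided by twice the number of pairs. This estimator carries one degree of freedom per pair, in agreement with the 37 and 44 degrees of freedom quoted, but the computation actually used is not stated in the text. The data are invented for the example.

```python
def within_pair_variance(pairs):
    """Within-pair mean square: sum of (x1 - x2)^2 over pairs, divided by 2n."""
    return sum((x1 - x2) ** 2 for x1, x2 in pairs) / (2 * len(pairs))


def heritability(dz_pairs, mz_pairs):
    """Return (h2, F, degrees of freedom) from dizygous and monozygous pairs."""
    var_d = within_pair_variance(dz_pairs)
    var_m = within_pair_variance(mz_pairs)
    h2 = (var_d - var_m) / var_d        # formula (1)
    f_ratio = var_d / var_m             # tested against the F distribution
    return h2, f_ratio, (len(dz_pairs), len(mz_pairs))


# Hypothetical statures in cm, one tuple per twin pair
dz = [(170, 176), (165, 161), (180, 172)]
mz = [(170, 172), (160, 158), (175, 177), (168, 168)]
h2, f_ratio, dof = heritability(dz, mz)
```

Since h² = 1 − 1/F, testing whether F exceeds unity is the same as testing whether h² exceeds zero. A tail probability for F could be obtained from tables or from a routine such as `scipy.stats.f.sf`.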

The variance within dizygous twins exceeds that within monozygous pairs for 
every character included in this report, the difference being significant at the 5% 
level for every trait except birth weight, bi-acromial breadth, relative shoulder 
breadth, waist circumference, and cephalic index. The fact that birth weight was not 
found to be significantly heritable is in accord with the results of other studies, such 
as those of Dahlberg (1926) and Taniguchi (1955). The low heritability of waist
circumference reflects the influence on this character of such environmental factors as
diet and exercise. The values of h² for weight, stature, sitting height, head length, and
head width are comparable with those reported by Newman, Freeman, and Holzinger
(1937). For cephalic index, however, these authors obtained h² = .75, a result appreciably
higher than our value of .38. The work of von Verschuer (1927) is not directly
comparable with that reported here, since his analyses did not involve variances; but
the values of h² given in Table I are consistent with his computations.


DISCUSSION 


The estimation of heritability from a comparison of identical and fraternal twins 
involves the assumption that the environments of the two members of a set of 
monozygous twins are, on the average, neither more nor less different from one an- 
other than are the environments of the two members of a pair of dizygous twins of 
like sex. This assumption has been discussed by Price (1950). There is some evidence 
that the postnatal environments of monozygous twins are more similar than are 
those of dizygous twins. But because of the possibility of imbalance in the mutual
circulation of monochorial monozygous embryos, and of other related phenomena, 
it is probable that the prenatal environments of monozygous twins are less alike 
than are those of dizygous twins. Assuming, however, the validity of the premise, 
heritability may be estimated by comparing the variance within pairs of dizygous
twins, σ_D², with that within monozygous pairs, σ_M². The variance σ_D² contains a genetic,
an environmental, and an error component, all of which are assumed to be additive,
whereas σ_M² contains the environmental and error components only. The proportion
of the total variance within like-sexed dizygous twins which may be attributed to
genetic factors is therefore given by formula (1), as was shown by Holzinger (1929).
Holzinger’s expression,


$$h^2 = \frac{r_M - r_D}{1 - r_D} \qquad (2)$$


is equivalent to (1) if


$$r_M = 1 - \frac{\sigma_M^2}{V} \qquad (3)$$

and

$$r_D = 1 - \frac{\sigma_D^2}{V} \qquad (4)$$


where r_M and r_D are the intraclass correlations for monozygous and dizygous twins,
respectively, and V is the total variance for all twins, both monozygous and dizygous.
But if

$$r_M = 1 - \frac{\sigma_M^2}{V_M} \qquad (5)$$

and

$$r_D = 1 - \frac{\sigma_D^2}{V_D} \qquad (6)$$


where the variances V_M and V_D are computed separately for monozygous and
dizygous twins respectively, then formula (2) is equivalent to

$$h^2 = 1 - \frac{\sigma_M^2 V_D}{\sigma_D^2 V_M} \qquad (7)$$

Since formula (7) is sensitive to sex, age and other between-pair differences in the
composition of the monozygous and dizygous samples, whereas (1) is insensitive to
all such differences as do not affect the variances within pairs, we have employed
(1) rather than (7) in our estimates of h².
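
The distinction drawn here can be illustrated numerically. The Python sketch below takes each intraclass correlation as one minus the ratio of the within-pair variance to a total variance, which is the form assumed for the relations numbered (3) to (6). All variances are invented for the example, and the function names are illustrative.

```python
def intraclass_r(within_var, total_var):
    """Intraclass correlation from within-pair and total variance (assumed form)."""
    return 1 - within_var / total_var


def holzinger(r_m, r_d):
    """Holzinger's expression, formula (2)."""
    return (r_m - r_d) / (1 - r_d)


# Hypothetical within-pair variances
var_d, var_m = 20.0, 5.0

# Formula (1): uses the within-pair variances only
h2_direct = (var_d - var_m) / var_d

# Formula (2) with one total variance V for all twins reduces to formula (1)
v_all = 100.0
h2_pooled = holzinger(intraclass_r(var_m, v_all), intraclass_r(var_d, v_all))

# Formula (2) with separate total variances gives formula (7), which shifts
# whenever the two samples differ in between-pair composition
v_m, v_d = 80.0, 120.0
h2_separate = holzinger(intraclass_r(var_m, v_m), intraclass_r(var_d, v_d))
h2_formula7 = 1 - (var_m * v_d) / (var_d * v_m)
```

With these figures formula (1) and the pooled form both give 0.75, while the separate-variance form gives 0.625, although the within-pair variances are unchanged.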

It should be observed that the nongenetic component of the variation within
dizygous twins, assumed to be estimated by σ_M², cannot be exactly equated with the
environmental component since it also contains a component due to errors in
measurement. For most of the anthropometric traits these errors are small, so that 1 − h²
approximates the proportion of the variance attributable to environmental factors.

The statistic h² is an estimate, not of the extent to which a trait is genetically
determined, but of the proportion of the variation in the trait which is genetically
determined. If all of the genetic factors responsible for a character are identical in
every individual in some population, the genetic component of the variance will be
zero in that population, even if the genetic factors almost completely determine the
character. Furthermore, h² is applicable only to the population from which it is
derived. It is quite possible for a character that is heritable in one population not to 
be heritable in another, or for a trait that is not heritable among individuals at one 
age to be highly so among the same individuals at a later age. 

Heritability, as measured by h², is nevertheless of considerable evolutionary interest,
for it is an index of the susceptibility of a character to genetic change. That a character
may be heritable at one age and not at another is a corollary of the fact that evolution
may take place in one stage of the life cycle without affecting other stages. The
possibility of differences in the heritability of a given character between populations
is related to the fact that not all populations are equally sensitive to evolutionary
change. A high value of h² does not necessarily indicate that a trait is undergoing
rapid evolution, but the greater the value of h² for any character the greater may be
the evolutionary effect of differential fertility with respect to that character.

The extent to which variation in anthropometric characters affects reproduction is 
unknown. The population sampled by this study, however, exhibits a considerable 
degree of genetically determined variability in these features. Should any of these 
anthropometric characters be correlated with fertility to an appreciable extent, evolu- 
tionary change in the characters concerned would be expected to occur. 


SUMMARY 


The genetic fraction, h², of the variance within pairs of like-sexed dizygous twins
was estimated for a number of anthropometric characters. The greater part of the
variance of most of these traits was found to be genetically determined. If such
characters were correlated with fertility they might be expected to undergo evolutionary
change.
Correction


Formula (2) in Dr. Deraemaeker’s Letter to the Editor (Vol. 7(4):443) was printed incorrectly. 
It should read 
$$k = \frac{c}{c + \dfrac{16q(1 - c)}{1 + 15q}}$$


